1 INTEL 80387 PROGRAMMER'S REFERENCE MANUAL 1987
6 Intel Corporation makes no warranty for the use of its products and
7 assumes no responsibility for any errors which may appear in this document
8 nor does it make a commitment to update the information contained herein.
10 Intel retains the right to make changes to these specifications at any
13 Contact your local sales office to obtain the latest specifications before
16 The following are trademarks of Intel Corporation and may only be used to
17 identify Intel Products:
19 Above, BITBUS, COMMputer, CREDIT, Data Pipeline, FASTPATH, Genius, i, î,
20 ICE, iCEL, iCS, iDBP, iDIS, I²ICE, iLBX, im, iMDDX, iMMX, Inboard,
21 Insite, Intel, intel, intelBOS, Intel Certified, Intelevision,
22 inteligent Identifier, inteligent Programming, Intellec, Intellink,
23 iOSP, iPDS, iPSC, iRMK, iRMX, iSBC, iSBX, iSDM, iSXM, KEPROM, Library
24 Manager, MAPNET, MCS, Megachassis, MICROMAINFRAME, MULTIBUS, MULTICHANNEL,
25 MULTIMODULE, MultiSERVER, ONCE, OpenNET, OTP, PC BUBBLE, Plug-A-Bubble,
26 PROMPT, Promware, QUEST, QueX, Quick-Pulse Programming, Ripplemode, RMX/80,
27 RUPI, Seamless, SLD, SugarCube, SupportNET, UPI, and VLSiCEL, and the
28 combination of ICE, iCS, iRMX, iSBC, iSBX, iSXM, MCS, or UPI and a numerical
31 MDS is an ordering code only and is not used as a product name or
32 trademark. MDS(R) is a registered trademark of Mohawk Data Sciences
35 *MULTIBUS is a patented Intel bus.
36 Unix is a trademark of AT&T Bell Labs.
37 MS-DOS, XENIX, and Multiplan are trademarks of Microsoft Corporation.
38 Lotus and 1-2-3 are registered trademarks of Lotus Development Corporation.
39 SuperCalc is a registered trademark of Computer Associates International.
40 Framework is a trademark of Ashton-Tate.
41 System 370 is a trademark of IBM Corporation.
42 AT is a registered trademark of IBM Corporation.
44 Additional copies of this manual or other Intel literature may be obtained
48 Literature Distribution
53 (c)INTEL CORPORATION 1987 CG-5/26/87
58 ===========================================================================
60 Customer Support is Intel's complete support service that provides Intel
61 customers with hardware support, software support, customer training, and
62 consulting services. For more information contact your local sales offices.
64 After a customer purchases any system hardware or software product,
65 service and support become major factors in determining whether that
66 product will continue to meet a customer's expectations. Such support
67 requires an international support organization and a breadth of programs
68 to meet a variety of customer needs. As you might expect, Intel's customer
69 support is quite extensive. It includes factory repair services and
70 worldwide field service offices providing hardware repair services,
71 software support services, customer training classes, and consulting
74 Hardware Support Services
76 Intel is committed to providing an international service support package
77 through a wide variety of service offerings available from Intel Hardware
80 Software Support Services
82 Intel's software support consists of two levels of contracts. Standard
83 support includes TIPS (Technical Information Phone Service), updates and
84 subscription service (product-specific troubleshooting guides and COMMENTS
85 Magazine). Basic support includes updates and the subscription service.
86 Contracts are sold in environments which represent product groupings
87 (i.e., iRMX environment).
91 Intel provides field systems engineering services for any phase of your
92 development or support effort. You can use our systems engineers in a
93 variety of ways ranging from assistance in using a new product, developing
94 an application, personalizing training, and customizing or tailoring an
95 Intel product to providing technical and management consulting. Systems
96 Engineers are well versed in technical areas such as microcommunications,
97 real-time applications, embedded microcontrollers, and network services.
98 You know your application needs; we know our products. Working together we
99 can help you get a successful product to market in the least possible time.
103 Intel offers a wide range of instructional programs covering various
104 aspects of system design and implementation. In just three to ten days a
105 limited number of individuals learn more in a single workshop than in
106 weeks of self-study. For optimum convenience, workshops are scheduled
107 regularly at Training Centers worldwide, or we can take our workshops to
108 you for on-site instruction. Covering a wide variety of topics, Intel's
109 major course categories include: architecture and assembly language,
110 programming and operating systems, BITBUS and LAN applications.
112 Training Center Locations
114 To obtain a complete catalog of our workshops, call the nearest Training
117 Boston (617) 692-1000
118 Chicago (312) 310-5700
119 San Francisco (415) 940-7800
120 Washington D.C. (301) 474-2878
121 Israel (972) 349-491-099
123 Osaka (Call Tokyo) 03-437-6611
124 Toronto, Canada (416) 675-2105
125 London (0793) 696-000
128 Stockholm (468) 734-01-00
130 Benelux (Rotterdam) (10) 21-23-77
131 Copenhagen (1) 198-033
137 ===========================================================================
139 This manual describes the 80387 Numeric Processor Extension (NPX) for the
140 80386 microprocessor. Understanding the 80387 requires an understanding of
141 the 80386; therefore, a brief overview of 80386 concepts is presented first.
142 A detailed discussion of the 80386 microprocessor can be found in the 80386
143 Programmer's Reference Manual.
145 The 80386 Microsystem
147 The 80386 is the basis of a new VLSI microprocessor system with exceptional
148 capabilities for supporting large-system applications. This powerful
149 microsystem is designed to support multiuser reprogrammable and real-time
150 multitasking applications. Its dedicated system support circuits simplify
151 system hardware; sophisticated hardware and software tools reduce both the
152 time and the cost of product development. The 80386 microsystem offers a
153 total-solution approach, enabling you to develop high-speed, interactive,
154 multiuser, multitasking (even multiprocessor) systems more rapidly and at
155 higher performance than ever before.
157 - Reliability and system up-time are becoming increasingly important in
158 all applications. Information must be protected from misuse or
159 accidental loss. The 80386 includes a sophisticated and flexible
160 four-level protection mechanism that can isolate layers of operating
161 system programs from application programs to maintain a high degree of
164 - The 80386 addresses up to 4 gigabytes of physical memory to support
165 today's application requirements. This large physical memory enables
166 the 80386 to keep many large programs and data structures
167 simultaneously in memory for high-speed access.
169 - For applications with dynamically changing memory requirements, such
170 as multiuser business systems, the 80386 CPU provides on-chip memory
171 management and virtual memory support. On an 80386-based system, each
172 user can have up to 64 terabytes of virtual-address space. This large
173 address space virtually eliminates restrictions on the size of programs
174 that may be part of the system. The memory management features are
175 subject to control of systems software; therefore, systems software
176 designers can choose among a variety of memory-organization models.
177 Systems designers can choose to view memory in terms of fixed-length
178 pages, in terms of variable length segments, or as a combination of
179 pages and segments. The sizes of segments can range from one byte to 4
180 gigabytes. Virtual memory can be implemented either at the level of
181 segments or at the level of pages.
183 - Large multiuser or real-time multitasking systems are easily supported
184 by the 80386. High-performance features, such as a very high-speed task
185 switch, fast interrupt-response time, intertask protection,
186 page-oriented virtual memory, and a quick and direct operating system
187 interface, make the 80386 highly suited to multiuser/multitasking
190 - The 80386 has two primary operating modes: real-address mode and
191 protected mode. In real-address mode, the 80386/80387 is fully upward
192 compatible from the 8086, 8088, 80186, and 80188 microprocessors and
193 from the 80286 real-address mode; all of the extensive libraries of
194 8086 and 8088 software execute 15 to 20 times faster on the 80386,
195 without any modification.
197 - In protected-address mode, the advanced memory management
198 and protection features of the 80386 become available, without any
199 reduction in performance. Upgrading 8086 and 8088 application
200 programs to use these new memory management and protection features
201 usually requires only reassembly or recompilation (some programs may
202 require minor modification). Entire 80286 protected-mode applications
203 can run in this mode without modification.
205 - The virtual-8086 mode of the 80386 is available when the primary mode
206 is protected mode. Virtual-8086 mode enables direct execution of
207 multiple 8086/8088 programs within a protected-mode environment. Most
208 8086 and 8088 application programs can be executed in this environment
209 without alteration (refer to the 80386 Programmer's Reference Manual
210 for differences from 8086). This high degree of compatibility between
211 80386 and earlier members of the 8086 processor family reduces both
212 the time and the cost of software development.
214 The Organization of This Manual
216 This manual describes the 80387 Numeric Processor Extension (NPX) for the
217 80386 microprocessor. The material in this manual is presented from the
218 perspective of software designers, both at an applications and at a systems
221 - Chapter 1, "Introduction to the 80387 Numerics Processor Extension,"
222 gives an overview of the 80387 NPX and reviews the concepts of numeric
223 computation using the 80387.
225 - Chapter 2, "80387 Numerics Processor Architecture," presents the
226 registers and data types of the 80387 to both applications and systems
229 - Chapter 3, "Special Computational Situations," discusses the special
230 values that can be represented in the 80387's real formats, namely denormal
231 numbers, zeros, infinities, and NaNs (not a number), as well as numerics
232 exceptions. This chapter should be read thoroughly by systems
233 programmers, but may be skimmed by applications programmers. Many of
234 these special values and exceptions may never occur in applications
237 - Chapter 4, "80387 Instruction Set," provides functional information
238 for software designers generating applications for systems containing
239 an 80386 CPU with an 80387 NPX. The 80386/80387 instruction set
240 mnemonics are explained in detail.
242 - Chapter 5, "Programming Numeric Applications," provides a description
243 of programming facilities for 80386/80387 systems. A comparative 80387
244 programming example is given.
246 - Chapter 6, "System-Level Numeric Programming," provides information of
247 interest to systems software writers, including details of the 80387
248 architecture and operational characteristics.
250 - Chapter 7, "Numeric Programming Examples," provides several detailed
251 programming examples for the 80387, including conditional branching,
252 the conversion between floating-point values and their ASCII
253 representations, and the use of trigonometric functions. These examples
254 illustrate assembly-language programming on the 80387 NPX.
256 - Appendix A, "Machine Instruction Encoding and Decoding," gives
257 reference information on the encoding of NPX instructions. This
258 information is useful to writers of debuggers, exception handlers, and
261 - Appendix B, "Exception Summary," provides a list of the exceptions
262 that each instruction can cause. This list is valuable to both
263 applications and systems programmers.
265 - Appendix C, "Compatibility between the 80387 and the 80287/8087,"
266 describes the differences from the 80387 that are common to the 80287
269 - Appendix D, "Compatibility between the 80387 and the 8087," describes
270 the additional differences between the 80387 and the 8087 that are of
271 concern when porting 8086/8087 programs directly to the 80386/80387.
274 - Appendix E, "80387 80-Bit CHMOS III Numeric Processor Extension,"
275 reproduces a data sheet of 80387 specifications that is separately
276 available. The table of instruction timings in this appendix will be of
277 interest to many readers of this manual. (The AC specifications have
278 been deliberately left out.) The specifications in data sheets are
279 subject to change; consult the most recent data sheet for design-in
282 - Appendix F, "PC/AT-Compatible 80387 Connection," documents a
283 nonstandard method of connecting an 80387 to an 80386 to achieve
284 compatibility with the IBM PC/AT.
286 - The Glossary defines 80387 and floating-point terminology. Refer to it
291 To best use the material in this manual, readers should be familiar with
292 the operation and architecture of 80386 systems. The following manuals
293 contain information related to the content of this manual and of interest to
294 programmers of 80387 systems:
296 - Introduction to the 80386, order number 231252
297 - 80386 Data Sheet, order number 231630
298 - 80386 Hardware Reference Manual, order number 231732
299 - 80386 Programmer's Reference Manual, order number 230985
300 - 80387 Data Sheet, order number 231920
303 Notational Conventions
305 This manual uses special notation to represent subscript and superscript
306 characters. Subscript characters are surrounded by {curly brackets}, for
307 example 10{2} = 10 base 2. Superscript characters are preceded by a caret
308 and enclosed within (parentheses), for example 10^(3) = 10 to the third
314 ===========================================================================
316 Chapter 1 Introduction to the 80387 Numerics Processor Extension
323 1.6 Programming Interface
325 Chapter 2 80387 Numerics Processor Architecture
328 2.1.1 The NPX Register Stack
329 2.1.2 The NPX Status Word
331 2.1.4 The NPX Tag Word
332 2.1.5 The NPX Instruction and Data Pointers
334 2.2 Computation Fundamentals
336 2.2.2 Data Types and Formats
337 2.2.2.1 Binary Integers
338 2.2.2.2 Decimal Integers
341 2.2.3 Rounding Control
342 2.2.4 Precision Control
344 Chapter 3 Special Computational Situations
346 3.1 Special Numeric Values
347 3.1.1 Denormal Real Numbers
348 3.1.1.1 Denormals and Gradual Underflow
352 3.1.4 NaN (Not-a-Number)
353 3.1.4.1 Signaling NaNs
357 3.1.6 Encoding of Data Types
358 3.1.7 Unsupported Formats
360 3.2 Numeric Exceptions
361 3.2.1 Handling Numeric Exceptions
362 3.2.1.1 Automatic Exception Handling
363 3.2.1.2 Software Exception Handling
365 3.2.2 Invalid Operation
366 3.2.2.1 Stack Exception
367 3.2.2.2 Invalid Arithmetic Operation
369 3.2.3 Division by Zero
370 3.2.4 Denormal Operand
371 3.2.5 Numeric Overflow and Underflow
375 3.2.6 Inexact (Precision)
376 3.2.7 Exception Priority
377 3.2.8 Standard Underflow/Overflow Exception Handler
379 Chapter 4 The 80387 Instruction Set
381 4.1 Compatibility with the 80287 and 8087
383 4.3 Data Transfer Instructions
385 4.3.2 FST destination
386 4.3.3 FSTP destination
387 4.3.4 FXCH//destination
389 4.3.6 FIST destination
390 4.3.7 FISTP destination
392 4.3.9 FBSTP destination
394 4.4 Nontranscendental Instructions
396 4.4.2 Normal Subtraction
397 4.4.3 Reversed Subtraction
399 4.4.5 Normal Division
400 4.4.6 Reversed Division
403 4.4.9 FPREM---Partial Remainder (80287/8087-Compatible)
404 4.4.10 FPREM1---Partial Remainder (IEEE Std. 754-Compatible)
410 4.5 Comparison Instructions
422 4.6 Transcendental Instructions
432 4.7 Constant Instructions
441 4.8 Processor Control Instructions
444 4.8.3 FSTCW/FNSTCW destination
445 4.8.4 FSTSW/FNSTSW destination
446 4.8.5 FSTSW AX/FNSTSW AX
448 4.8.7 FSAVE/FNSAVE destination
450 4.8.9 FSTENV/FNSTENV destination
454 4.8.13 FFREE destination
456 4.8.15 FWAIT (CPU Instruction)
458 Chapter 5 Programming Numeric Applications
460 5.1 Programming Facilities
461 5.1.1 High-Level Languages
465 5.1.4.1 Defining Data
466 5.1.4.2 Records and Structures
467 5.1.4.3 Addressing Methods
469 5.1.5 Comparative Programming Example
470 5.1.6 80387 Emulation
472 5.2 Concurrent Processing with the 80387
473 5.2.1 Managing Concurrency
474 5.2.1.1 Incorrect Exception Synchronization
475 5.2.1.2 Proper Exception Synchronization
477 Chapter 6 System-Level Numeric Programming
479 6.1 80386/80387 Architecture
480 6.1.1 Instruction and Operand Transfer
481 6.1.2 Independent of CPU Addressing Modes
482 6.1.3 Dedicated I/O Locations
484 6.2 Processor Initialization and Control
485 6.2.1 System Initialization
486 6.2.2 Hardware Recognition of the NPX
487 6.2.3 Software Recognition of the NPX
488 6.2.4 Configuring the Numerics Environment
489 6.2.5 Initializing the 80387
490 6.2.6 80387 Emulation
491 6.2.7 Handling Numerics Exceptions
492 6.2.8 Simultaneous Exception Response
493 6.2.9 Exception Recovery Examples
495 Chapter 7 Numeric Programming Examples
497 7.1 Conditional Branching Example
498 7.2 Exception Handling Examples
499 7.3 Floating-Point to ASCII Conversion Examples
500 7.3.1 Function Partitioning
501 7.3.2 Exception Considerations
502 7.3.3 Special Instructions
503 7.3.4 Description of Operation
504 7.3.5 Scaling the Value
505 7.3.5.1 Inaccuracy in Scaling
506 7.3.5.2 Avoiding Underflow and Overflow
507 7.3.5.3 Final Adjustments
511 7.4 Trigonometric Calculation Examples (Not Tested)
513 Appendix A Machine Instruction Encoding and Decoding
515 Appendix B Exception Summary
517 Appendix C Compatibility Between the 80387 and the 80287/8087
519 Appendix D Compatibility Between the 80387 and the 8087
521 Appendix E 80387 80-Bit CHMOS III Numeric Processor Extension
523 Appendix F PC/AT-Compatible 80387 Connection
525 Glossary of 80387 and Floating-Point Terminology
530 1-1 Evolution and Performance of Numeric Processors
532 2-1 80387 Register Set
533 2-2 80387 Status Word
534 2-3 80387 Control Word Format
535 2-4 80387 Tag Word Format
536 2-5 Protected Mode 80387 Instruction and Data Pointer Image in Memory,
538 2-6 Real Mode 80387 Instruction and Data Pointer Image in Memory,
540 2-7 Protected Mode 80387 Instruction and Data Pointer Image in Memory,
542 2-8 Real Mode 80387 Instruction and Data Pointer Image in Memory,
544 2-9 80387 Double-Precision Number System
545 2-10 80387 Data Formats
547 3-1 Floating-Point System with Denormals
548 3-2 Floating-Point System without Denormals
549 3-3 Arithmetic Example Using Infinity
551 4-1 FSAVE/FRSTOR Memory Layout (32-Bit)
552 4-2 FSAVE/FRSTOR Memory Layout (16-Bit)
553 4-3 Protected Mode 80387 Environment, 32-Bit Format
554 4-4 Real Mode 80387 Environment, 32-Bit Format
555 4-5 Protected Mode 80387 Environment, 16-Bit Format
556 4-6 Real Mode 80387 Environment, 16-Bit Format
558 5-1 Sample C-386 Program
559 5-2 Sample 80387 Constants
560 5-3 Status Word Record Definition
561 5-4 Structure Definition
562 5-5 Sample PL/M-386 Program
563 5-6 Sample ASM386 Program
564 5-7 Instructions and Register Stack
565 5-8 Exception Synchronization Examples
567 6-1 Software Routine to Recognize the 80287
569 7-1 Conditional Branching for Compares
570 7-2 Conditional Branching for FXAM
571 7-3 Full-State Exception Handler
572 7-4 Reduced-Latency Exception Handler
573 7-5 Reentrant Exception Handler
574 7-6 Floating-Point to ASCII Conversion Routine
576 7-7 Relationships between Adjacent Joints
577 7-8 Robot Arm Kinematics Example
582 1-1 Numeric Processing Speed Comparisons
583 1-2 Numeric Data Types
584 1-3 Principal NPX Instructions
586 2-1 Condition Code Interpretation
587 2-2 Correspondence between 80387 and 80386 Flag Bits
588 2-3 Summary of Format Parameters
589 2-4 Real Number Notation
592 3-1 Arithmetic and Nonarithmetic Instructions
593 3-2 Denormalization Process
594 3-3 Zero Operands and Results
595 3-4 Infinity Operands and Results
596 3-5 Rules for Generating QNaNs
597 3-6 Binary Integer Encodings
598 3-7 Packed Decimal Encodings
599 3-8 Single and Double Real Encodings
600 3-9 Extended Real Encodings
601 3-10 Masked Responses to Invalid Operations
602 3-11 Masked Overflow Results
604 4-1 Data Transfer Instructions
605 4-2 Nontranscendental Instructions
606 4-3 Basic Nontranscendental Instructions and Operands
607 4-4 Condition Code Interpretation after FPREM and FPREM1
609 4-5 Comparison Instructions
610 4-6 Condition Code Resulting from Comparisons
611 4-7 Condition Code Resulting from FTST
612 4-8 Condition Code Defining Operand Class
613 4-9 Transcendental Instructions
614 4-10 Results of FPATAN
615 4-11 Constant Instructions
616 4-12 Processor Control Instructions
618 5-1 PL/M-386 Built-In Procedures
619 5-2 ASM386 Storage Allocation Directives
620 5-3 Addressing Method Examples
622 6-1 NPX Processor State Following Initialization
625 Chapter 1 Introduction to the 80387 Numerics Processor Extension
627 ===========================================================================
629 The 80387 NPX is a high-performance numerics processing element that
630 extends the 80386 architecture by adding significant numeric capabilities
631 and direct support for floating-point, extended-integer, and BCD data types.
632 The 80386 CPU with 80387 NPX easily supports powerful and accurate numeric
633 applications through its implementation of the IEEE Standard 754 for Binary
634 Floating-Point Arithmetic. The 80387 provides floating-point performance
635 comparable to that of large minicomputers while offering compatibility with
636 object code for 8087 and 80287.
641 The 80387 Numeric Processor Extension (NPX) is compatible with its
642 predecessors, the earlier Intel 8087 NPX and 80287 NPX. As the 80386 runs
643 8086 programs, so programs designed to use the 8087 and 80287 should run
644 unchanged on the 80387.
646 The 8087 NPX was designed for use in 8086-family systems. The 8086 was the
647 first microprocessor family to partition the processing unit to permit
648 high-performance numeric capabilities. The 8087 NPX for this processor
649 family implemented a complete numeric processing environment in compliance
650 with an early proposal for the IEEE 754 Floating-Point Standard.
652 With the 80287 Numeric Processor Extension, high-speed numeric computations
653 were extended to 80286 high-performance multitasking and multiuser systems.
654 Multiple tasks using the numeric processor extension were afforded the full
655 protection of the 80286 memory management and protection features.
657 The 80387 Numeric Processor Extension is Intel's third generation numerics
658 processor. The 80387 implements the final IEEE standard, adds new
659 trigonometric instructions, and uses a new design and CHMOS-III process to
660 allow higher clock rates and require fewer clocks per instruction. Together,
661 the 80387 with additional instructions and the improved standard bring even
662 more convenience and reliability to numerics programming and make this
663 convenience and reliability available to applications that need the
664 high-speed and large memory capacity of the 32-bit environment of the 80386
667 Figure 1-1 illustrates the relative performance of 5-MHz 8086/8087,
668 8-MHz 80286/80287, and 20-MHz 80386/80387 systems in executing
669 numerics-oriented applications.
672 Figure 1-1. Evolution and Performance of Numeric Processors
674 [Figure 1-1 is a chart that did not survive conversion. Recoverable
687 labels: axis value 16 beside 80386/80387 (20 MHz); axis value 3 beside
690 80286/80287 (8 MHz).]
698 Table 1-1 compares the execution times of several 80387 instructions with
699 the equivalent operations executed on an 8-MHz 80287. As indicated in the
700 table, the 16-MHz 80387 NPX provides about 5 to 6 times the performance of
701 an 8-MHz 80287 NPX. A 16-MHz 80387 multiplies 32-bit and 64-bit
702 floating-point numbers in about 1.9 and 2.8 microseconds, respectively. Of
703 course, the actual performance of the NPX in a given system depends on the
704 characteristics of the individual application.
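As a rough cross-check of the multiply timings quoted above, an execution time and a clock rate can be converted into an approximate clock count per instruction. The sketch below is only arithmetic on the figures in the text (16 MHz, 1.9 and 2.8 microseconds); the function name `clocks_per_instruction` is illustrative and is not part of any Intel tool.

```python
def clocks_per_instruction(time_us, clock_mhz):
    """Convert an execution time in microseconds into clock periods.

    One microsecond at f MHz spans f clock periods, so the product of
    the two figures is the clock count.
    """
    return time_us * clock_mhz


# Figures quoted in the text for a 16-MHz 80387.
single_clocks = clocks_per_instruction(1.9, 16)  # 32-bit multiply
double_clocks = clocks_per_instruction(2.8, 16)  # 64-bit multiply

print(f"32-bit multiply: about {single_clocks:.0f} clocks")
print(f"64-bit multiply: about {double_clocks:.0f} clocks")
```

On these figures the two multiplies come to roughly 30 and 45 clocks. As the text notes, actual performance depends on the individual application.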
706 Although the performance figures shown in Table 1-1 refer to operations on
707 real (floating-point) numbers, the 80387 also manipulates fixed-point
708 binary and decimal integers of up to 64 bits or 18 digits, respectively. The
709 80387 can improve the speed of multiple-precision software algorithms for
710 integer operations by 10 to 100 times.
712 Because the 80387 NPX is an extension of the 80386 CPU, no software
713 overhead is incurred in setting up the NPX for computation. The 80387 and
714 80386 processors coordinate their activities in a manner transparent to
715 software. Moreover, built-in coordination facilities allow the 80386 CPU to
716 proceed with other instructions while the 80387 NPX is simultaneously
717 executing numeric instructions. Programs can exploit this concurrency of
718 execution to further increase system performance and throughput.
721 Table 1-1. Numeric Processing Speed Comparisons
723 Floating-Point Instruction                          Approximate Performance Ratio:
724                                                     16 MHz 80386/80387 ÷ 8 MHz 80286/80287
725 --------------------------------------------------  --------------------------------------
727 FADD   ST, ST(i)               Addition             6.2
728 FDIV   dword_var               Division             4.7
729 FYL2X  stack (0), (1) assumed  Logarithm            6.0
730 FPATAN stack (0) assumed       Arctangent           2.6 (see note)
733 F2XM1  stack (0) assumed       Exponentiation       2.7 (see note)
734 Note: The ratio is higher if the operand is not in range of the 80287.
740 The 80387 NPX offers more than raw execution speed for
741 computation-intensive tasks. The 80387 brings the functionality and power of
742 accurate numeric computation into the hands of the general user. These
743 features are available in most high-level languages available for the 80386.
745 Like the 8087 and 80287 that preceded it, the 80387 is explicitly designed
746 to deliver stable, accurate results when programmed using straightforward
747 "pencil and paper" algorithms. The IEEE standard 754 specifically addresses
748 this issue, recognizing the fundamental importance of making numeric
749 computations both easy and safe to use.
751 For example, most computers can overflow when two single-precision
752 floating-point numbers are multiplied together and then divided by a third,
753 even if the final result is a perfectly valid 32-bit number. The 80387
754 delivers the correctly rounded result. Other typical examples of undesirable
755 machine behavior in straightforward calculations occur when computing
756 financial rate of return, which involves the expression (1 + i)^(n) or when
757 solving for roots of a quadratic equation:
760 r = (-b +/- sqrt(b^(2) - 4ac)) / (2a)
763 If a does not equal 0, the formula is numerically unstable when the roots
764 are nearly coincident or when their magnitudes are wildly different. The
765 formula is also vulnerable to spurious over/underflows when the coefficients
766 a, b, and c are all very big or all very tiny. When single-precision
767 (4-byte) floating-point coefficients are given as data and the formula is
768 evaluated in the 80387's normal way, keeping all intermediate results in
769 its stack, the 80387 produces impeccable single-precision roots. This
770 happens because, by default and with no effort on the programmer's part, the
771 80387 evaluates all those subexpressions with so much extra precision and
772 range as to overwhelm any threat to numerical integrity.
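The effect described above can be imitated in a few lines. The sketch below is not 80387 code: it assumes Python floats are IEEE double precision and uses them as a stand-in for the NPX's wider internal format, and it models single precision by rounding through the `struct` module. The helper names (`to_single`, `roots_narrow`, `roots_wide`) and the coefficients are hypothetical. Rounding each double result to single is assumed to match true single-precision arithmetic for the basic operations used here.

```python
import math
import struct


def to_single(x):
    """Round a Python float (double precision) to the nearest single."""
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:
        # struct rejects finite values too large for single precision;
        # model that as overflow to a signed infinity.
        return math.copysign(math.inf, x)


def roots_narrow(a, b, c):
    """Textbook formula with every intermediate rounded to single."""
    disc = to_single(to_single(b * b) - to_single(to_single(4.0 * a) * c))
    root = to_single(math.sqrt(disc))
    two_a = to_single(2.0 * a)
    return (to_single(to_single(-b + root) / two_a),
            to_single(to_single(-b - root) / two_a))


def roots_wide(a, b, c):
    """Same formula, intermediates kept in the wider format; only the
    final roots are rounded to single."""
    root = math.sqrt(b * b - 4.0 * a * c)
    return (to_single((-b + root) / (2.0 * a)),
            to_single((-b - root) / (2.0 * a)))


# Hypothetical coefficients whose roots differ wildly in magnitude.
a, b, c = 1.0, -10000.0, 1.0
narrow = roots_narrow(a, b, c)
wide = roots_wide(a, b, c)
print("narrow intermediates:", narrow)
print("wide intermediates:  ", wide)
```

With these coefficients the narrow evaluation loses the small root entirely (it comes out as zero), while the wide evaluation returns a value close to 0.0001. The 80387's extended format carries more precision than the double format used in this stand-in.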
774 If double-precision data and results were at issue, a better formula would
775 have to be used, and once again the 80387's default evaluation of that
776 formula would provide substantially enhanced numerical integrity over mere
777 double-precision evaluation.
779 On most machines, straightforward algorithms will not deliver consistently
780 correct results (and will not indicate when they are incorrect). To obtain
781 correct results on traditional machines under all conditions usually
782 requires sophisticated numerical techniques that are foreign to most
783 programmers. General application programmers using straightforward
784 algorithms will produce much more reliable programs using the 80387. This
785 simple fact greatly reduces the software investment required to develop
786 safe, accurate computation-based products.
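To make the earlier multiply-then-divide example concrete, the sketch below compares an evaluation that rounds the intermediate product to single precision with one that keeps it in a wider format. It is an illustration in Python, not 80387 code: Python's double-precision float stands in for the NPX's wider internal format, and `to_single`, `mul_div_narrow`, `mul_div_wide`, and the operand values are hypothetical names and data chosen for the demonstration.

```python
import math
import struct


def to_single(x):
    """Round a Python float (double precision) to the nearest single."""
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:
        # Finite values beyond the single-precision range overflow
        # to a signed infinity in this model.
        return math.copysign(math.inf, x)


def mul_div_narrow(a, b, c):
    """Compute a * b / c, rounding the product to single first."""
    product = to_single(a * b)
    return to_single(product / c)


def mul_div_wide(a, b, c):
    """Compute a * b / c, rounding only the final result to single."""
    return to_single(a * b / c)


# Each operand is a valid single-precision number, and so is the
# mathematically exact result, but the product a * b is not.
a = b = c = to_single(1.0e30)
narrow = mul_div_narrow(a, b, c)
wide = mul_div_wide(a, b, c)
print("narrow intermediate:", narrow)
print("wide intermediate:  ", wide)
```

The narrow version overflows to infinity at the intermediate product, while the wide version returns the original operand value.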
788 Beyond traditional numerics support for scientific applications, the 80387
789 has built-in facilities for commercial computing. It can process decimal
790 numbers of up to 18 digits without round-off errors, performing exact
791 arithmetic on integers as large as 2^(64) or 10^(18). Exact arithmetic is
792 vital in accounting applications where rounding errors may introduce
793 monetary losses that cannot be reconciled.
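The 18-digit claim can be checked with integer arithmetic. The sketch below assumes that the 80387's extended format has a 64-bit significand and that a Python float is an IEEE double with a 53-bit significand; the variable names are illustrative only.

```python
# Largest 18-digit decimal integer.
LARGEST_18_DIGIT = 10**18 - 1

# Number of bits needed to hold it exactly.
bits_needed = LARGEST_18_DIGIT.bit_length()

# Assumed significand widths: 64 bits (extended), 53 bits (double).
fits_64_bit_significand = bits_needed <= 64
fits_53_bit_significand = bits_needed <= 53

# Passing the value through a double shows the round-off that a
# 53-bit significand introduces.
round_trip = int(float(LARGEST_18_DIGIT))

print("bits needed:", bits_needed)
print("fits in 64-bit significand:", fits_64_bit_significand)
print("fits in 53-bit significand:", fits_53_bit_significand)
print("survives a double round trip:", round_trip == LARGEST_18_DIGIT)
```

Every 18-digit integer needs at most 60 bits, so it fits in a 64-bit significand with room to spare, but not in a 53-bit one. This is why accounting totals of this size can be held exactly in the wider format and not in double precision.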
795 The NPX contains a number of optional facilities that can be invoked by
796 sophisticated users. These advanced features include directed rounding,
797 gradual underflow, and programmed exception-handling facilities.
799 These automatic exception-handling facilities permit a high degree of
800 flexibility in numeric processing software, without burdening the
801 programmer. While performing numeric calculations, the NPX automatically
802 detects exception conditions that can potentially damage a calculation (for
803 example, X ÷ 0 or sqrt(X) when X < 0). By default, on-chip exception logic
804 handles these exceptions so that a reasonable result is produced and
805 execution may proceed without program interruption. Alternatively, the NPX
806 can signal the CPU, invoking a software exception handler to provide special
807 results whenever various types of exceptions are detected.
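The two responses described above, a default result versus a call to a software handler, can be modeled roughly as follows. This is a behavioral sketch in Python, not a description of the NPX's internal logic. `NumericException`, `divide`, `square_root`, and the `masked` flag are hypothetical names. The default results used here (signed infinity for a zero divide, NaN for an invalid operation) follow IEEE Standard 754 and are stated as assumptions rather than taken from this chapter.

```python
import math


class NumericException(Exception):
    """Stands in for invoking a software exception handler."""


def divide(x, y, masked=True):
    """Model x ÷ y, including the zero-divisor cases."""
    if y != 0.0:
        return x / y
    if math.isnan(x):
        return x
    if x == 0.0:
        # 0 ÷ 0 is an invalid operation rather than a zero divide.
        if not masked:
            raise NumericException("invalid operation")
        return math.nan
    if math.isfinite(x) and not masked:
        raise NumericException("zero divide")
    # Default response: infinity signed by the signs of both operands.
    sign = math.copysign(1.0, x) * math.copysign(1.0, y)
    return math.copysign(math.inf, sign)


def square_root(x, masked=True):
    """Model sqrt(X), where X < 0 is an invalid operation."""
    if x < 0.0:
        if not masked:
            raise NumericException("invalid operation")
        return math.nan
    return math.sqrt(x)


print(divide(1.0, 0.0))        # default response, execution continues
print(square_root(-4.0))       # default response, execution continues
try:
    divide(1.0, 0.0, masked=False)
except NumericException as exc:
    print("handler invoked for:", exc)
```

In the default case the calculation continues with a reasonable result. In the other case, control passes to code that can supply a special result for the application.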
812 The 80386's versatility and performance make it appropriate to a broad
813 array of numeric applications. In general, applications that exhibit any of
814 the following characteristics can benefit by implementing numeric processing
817 - Numeric data vary over a wide range of values, or include nonintegral
820 - Algorithms produce very large or very small intermediate results.
822 - Computations must be very precise; i.e., a large number of significant
823 digits must be maintained.
825 - Performance requirements exceed the capacity of traditional
828 - Consistently safe, reliable results must be delivered using a
829 programming staff that is not expert in numerical techniques.
831 Note also that the 80387 can reduce software development costs and improve
832 the performance of systems that use not only real numbers, but operate on
833 multiprecision binary or decimal integer values as well.
835 A few examples, which show how the 80387 might be used in specific numerics
836 applications, are described below. In many cases, these types of systems
837 have been implemented in the past with minicomputers or small mainframe
838 computers. The advent of the 80387 brings the size and cost savings of
839 microprocessor technology to these applications for the first time.
841 - Business data processing: The NPX's ability to accept decimal operands
842 and produce exact decimal results of up to 18 digits greatly simplifies
843 accounting programming. Financial calculations that use power functions
844 can take advantage of the 80387's exponentiation and logarithmic
845 instructions. Many business software packages can benefit from the
846 speed and accuracy of the 80387; for example, Lotus* 1-2-3*,
847 Multiplan*, SuperCalc*, and Framework*.
849 - Simulation: The large (32-bit) memory space of the 80386 coupled with
850 the raw speed of the 80386 and 80387 processors make 80386/80387
851 microsystems suitable for attacking large simulation problems, which
852 heretofore could only be executed on expensive mini and mainframe
853 computers. For example, complex electronic circuit simulations using
854 SPICE can now be performed on a microcomputer, the 80386/80387.
855 Simulation of mechanical systems using finite element analysis can
856 employ more elements, resulting in more detailed analysis or simulation
859 • Graphics transformations -- The 80387 can be used in graphics terminals
860 to locally perform many functions that normally demand the attention of
861 a main computer; these include rotation, scaling, and interpolation. By
862 also using an 82786 Graphics Display Controller to perform high-speed
863 drawing and window management, very powerful and highly self-sufficient
864 terminals can be built from a relatively small number of 80386 family
867 • Process control -- The 80387 solves dynamic range problems
868 automatically, and its extended precision allows control functions to
869 be fine-tuned for more accurate and efficient performance. Control
870 algorithms implemented with the NPX also contribute to improved
871 reliability and safety, while the 80387's speed can be exploited in
872 real-time operations.
874 • Computer numerical control (CNC) -- The 80387 can move and position
875 machine tool heads with accuracy in real-time. Axis positioning also
876 benefits from the hardware trigonometric support provided by the 80387.
878 • Robotics -- Coupling small size and modest power requirements with
879 powerful computational abilities, the 80387 is ideal for on-board
880 six-axis positioning.
882 • Navigation -- Very small, lightweight, and accurate inertial guidance
883 systems can be implemented with the 80387. Its built-in trigonometric
884 functions can speed and simplify the calculation of position from
887 • Data acquisition -- The 80387 can be used to scan, scale, and reduce
888 large quantities of data as it is collected, thereby lowering storage
889 requirements and time required to process the data for analysis.
891 The preceding examples are oriented toward traditional numerics
892 applications. There are, in addition, many other types of systems that do
893 not appear to the end user as computational, but can employ the 80387 to
894 advantage. Indeed, the 80387 presents the imaginative system designer with
895 an opportunity similar to that created by the introduction of the
896 microprocessor itself. Many applications can be viewed as numerically-based
897 if sufficient computational power is available to support this view (e.g.,
898 character generation for a laser printer). This is analogous to the
899 thousands of successful products that have been built around "buried"
900 microprocessors, even though the products themselves bear little
901 resemblance to computers.
906 The architecture of the 80386 CPU is specifically adapted so that a system
907 can be upgraded simply by plugging in the 80387 NPX. For this
908 reason, designers of 80386 systems may wish to incorporate the 80387 NPX
909 into their designs in order to offer two levels of price and performance at
910 little additional cost.
912 Two features of the 80386 CPU make the design and support of upgradable
913 80386 systems particularly simple:
915 • The 80386 can be programmed to recognize the presence of an 80387 NPX;
916 that is, software can recognize whether it is running on an 80386 with
917 or without an 80387 NPX.
919 • After determining whether the 80387 NPX is available, the 80386 CPU
920 can be instructed to let the NPX execute all numeric instructions. If
921 an 80387 NPX is not available, the 80386 CPU can emulate all 80387
922 numeric instructions in software. This emulation is completely
923 transparent to the application software -- the same object code may be
924 used by 80386 systems both with and without an 80387 NPX. No relinking
925 or recompiling of application software is necessary; the same code will
926 simply execute faster with the 80387 NPX than without.
928 To facilitate this design of upgradable 80386 systems, Intel provides a
929 software emulator for the 80387 that provides the functional equivalent of
930 the 80387 hardware, implemented in software on the 80386. Except for timing,
931 the operation of this 80387 emulator (EMUL387) is the same as for the 80387
932 NPX hardware. When the emulator is combined as part of the systems software,
933 the 80386 system with 80387 emulation and the 80386 with 80387 hardware are
934 virtually indistinguishable to an application program. This capability
935 makes it easy for software developers to maintain a single set of programs
936 for both systems. System manufacturers can offer the NPX as a simple plug-in
937 performance option without necessitating any changes in the user's software.
940 1.6 Programming Interface
942 The 80386/80387 pair is programmed as a single processor; all of the 80387
943 registers appear to a programmer as extensions of the basic 80386 register
944 set. The 80386 has a class of instructions known as ESCAPE instructions, all
945 having a common format. These ESC instructions are the numeric
946 instructions for the 80387 NPX; they are simply encoded
947 into the instruction stream along with 80386 instructions.
949 All of the CPU memory-addressing modes may be used in programming the NPX,
950 allowing convenient access to record structures, numeric arrays, and other
951 memory-based data structures. All of the memory management and protection
952 features of the CPU (both paging and segmentation) are extended to the NPX
955 Numeric processing in the 80387 centers around the NPX register stack.
956 Programmers can treat these eight 80-bit registers either as a fixed
957 register set, with instructions operating on explicitly-designated
958 registers, or as a classical stack, with instructions operating on the top
959 one or two stack elements.
961 Internally, the 80387 holds all numbers in a uniform 80-bit extended
962 format. Operands that may be represented in memory as 16-, 32-, or 64-bit
963 integers, 32-, 64-, or 80-bit floating-point numbers, or 18-digit packed BCD
964 numbers, are automatically converted into extended format as they are loaded
965 into the NPX registers. Computation results are subsequently converted back
966 into one of these destination data formats when they are stored into memory
967 from the NPX registers.
969 Table 1-2 lists each of the seven data types supported by the 80387,
970 showing the data format for each type. All operands are stored in memory
971 with the least significant digits starting at the initial (lowest) memory
972 address. Numeric instructions access and store memory operands using only
973 this initial address. For maximum system performance, all operands should
974 start at memory addresses divisible by four.
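The storage rule above can be illustrated with a short Python sketch. This is
a host-side model, not 80387 code. It assumes that the packed decimal image
holds two BCD digits per byte in the nine low-order bytes, with the sign in
bit 7 of the tenth byte; the helper name encode_packed_bcd is hypothetical.

```python
import struct


def encode_packed_bcd(value):
    """Encode an integer as a 10-byte packed decimal image, lowest address first."""
    if abs(value) >= 10**18:
        raise ValueError("packed decimal holds at most 18 digits")
    digits = f"{abs(value):018d}"            # most significant digit first
    image = bytearray()
    # Walk the digit pairs from least significant to most significant,
    # so that the least significant digits land at the lowest address.
    for i in range(16, -2, -2):
        high, low = int(digits[i]), int(digits[i + 1])
        image.append((high << 4) | low)
    image.append(0x80 if value < 0 else 0x00)  # sign byte at the highest address
    return bytes(image)


# Images of three binary formats; "<" selects least significant byte first.
word_image = struct.pack("<h", 0x1234)       # word integer
short_image = struct.pack("<i", -2)          # short integer
double_image = struct.pack("<d", 1.0)        # double real
```

In each image the byte at index 0 is the one stored at the operand's initial
(lowest) address.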
976 Table 1-3 lists the 80387 instructions by class. No special programming
977 tools are necessary to use the 80387, because all of the NPX instructions
978 and data types are directly supported by the ASM386 Assembler, by high-level
979 languages from Intel, and by assemblers and compilers produced by many
980 independent software vendors. Software routines for the 80387 may be written
981 in ASM386 Assembler or any of the following higher-level languages from
987 In addition, all of the development tools supporting the 8086/8087 and
988 80286/80287 can also be used to develop software for the 80386/80387.
990 All of these high-level languages provide programmers with access to the
991 computational power and speed of the 80387 without requiring an
992 understanding of the architecture of the 80386 and 80387 chips. Such
993 architectural considerations as concurrency and synchronization are handled
994 automatically by these high-level languages. For the ASM386 programmer,
995 specific rules for handling these issues are discussed in a later section
998 The following operating systems are known or expected to support the
999 80387: RMX-286/386, MS-DOS, Xenix-286/386, and Unix-286/386. Advanced
1000 in-circuit debugging support is provided by ICE-386.
1003 Table 1-2. Numeric Data Types
1005 Data Type        Bits   Significant   Approximate Range (Decimal)
                             Digits
1009 Word integer      16     4            -32,768 ≤ X ≤ +32,767
1010 Short integer     32     9            -2*10^(9) ≤ X ≤ +2*10^(9)
1011 Long integer      64     18           -9*10^(18) ≤ X ≤ +9*10^(18)
1012 Packed decimal    80     18           -99...99 ≤ X ≤ +99...99 (18 digits)
1013 Single real       32     6-7          1.18*10^(-38) ≤ |X| ≤ 3.40*10^(38)
1014 Double real       64     15-16        2.23*10^(-308) ≤ |X| ≤ 1.80*10^(308)
1016 Extended real     80     19           3.30*10^(-4932) ≤ |X| ≤ 1.21*10^(4932)
     Extended real is equivalent to the double extended format of IEEE Std 754.
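As a rough cross-check of the integer, single real, and double real rows of
Table 1-2, the Python sketch below derives the digit counts and ranges from
the formats themselves. The function names are illustrative only, and the bit
patterns are the standard IEEE Std 754 encodings of the smallest and largest
normalized values; extended real is not covered.

```python
import math
import struct


def integer_digits(bits):
    """Decimal digits that every signed integer of this width can hold."""
    return int(math.log10(2 ** (bits - 1) - 1))


def real_range(fmt, min_bits, max_bits):
    """Smallest and largest normalized magnitudes of a binary real format."""
    int_fmt = {"f": "<I", "d": "<Q"}[fmt]
    smallest = struct.unpack("<" + fmt, struct.pack(int_fmt, min_bits))[0]
    largest = struct.unpack("<" + fmt, struct.pack(int_fmt, max_bits))[0]
    return smallest, largest


# Exponent field 1 with zero fraction, and the largest finite encoding.
single = real_range("f", 0x00800000, 0x7F7FFFFF)
double = real_range("d", 0x0010000000000000, 0x7FEFFFFFFFFFFFFF)
```

Formatting these values to three significant digits reproduces the single
and double real limits shown in the table.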
1019 Table 1-3. Principal NPX Instructions
1021 Class              Instruction Types
1023 Data Transfer      Load (all data types), Store (all data types), Exchange
1025 Arithmetic         Add, Subtract, Multiply, Divide, Subtract Reversed,
1026                    Divide Reversed, Square Root, Scale, Remainder, Integer
1027                    Part, Change Sign, Absolute Value, Extract
1029 Comparison         Compare, Examine, Test
1031 Transcendental     Tangent, Arctangent, Sine, Cosine, Sine and Cosine,
1032                    2^(x) - 1, Y * Log{2}(X), Y * Log{2}(X+1)
1034 Constants          0, 1, π, Log{10}2, Log{e}2, Log{2}10, Log{2}e
1036 Processor Control  Load Control Word, Store Control Word, Store Status
1037                    Word, Load Environment, Store Environment, Save,
1038                    Restore, Clear Exceptions, Initialize
1062 Chapter 2 80387 Numerics Processor Architecture
1064 ----------------------------------------------------------------------------
1066 To the programmer, the 80387 NPX appears as a set of additional registers,
1067 data types, and instructions -- all of which complement those of the 80386.
1068 Refer to Chapter 4 for detailed explanations of the 80387 instruction set.
1069 This chapter explains the new registers and data types that the 80387 brings
1070 to the architecture of the 80386.
1075 The additional registers consist of
1077 • Eight individually-addressable 80-bit numeric registers, organized as
     a register stack
1080 • Three sixteen-bit registers containing the NPX status word,
1083 the NPX control word, and the NPX tag word
1086 • Two 48-bit registers containing pointers to the current instruction
1087 and operand (these registers are actually located in the 80386)
1089 All of the NPX numeric instructions focus on the contents of these NPX
     registers.
1093 2.1.1 The NPX Register Stack
1095 The 80387 register stack is shown in Figure 2-1. Each of the eight numeric
1096 registers in the 80387's register stack is 80 bits wide and is divided into
1097 fields corresponding to the NPX's extended real data type.
1099 Numeric instructions address the data registers relative to the register on
1100 the top of the stack. At any point in time, this top-of-stack register is
1101 indicated by the TOP (stack TOP) field in the NPX status word. Load or push
1102 operations decrement TOP by one and load a value into the new top register.
1103 A store-and-pop operation stores the value from the current TOP register and
1104 then increments TOP by one. Like 80386 stacks in memory, the 80387 register
1105 stack grows down toward lower-addressed registers.
1107 Many numeric instructions have several addressing modes that permit the
1108 programmer to implicitly operate on the top of the stack, or to explicitly
1109 operate on specific registers relative to the TOP. The ASM386 Assembler
1110 supports these register addressing modes, using the expression ST(0), or
1111 simply ST, to represent the current Stack Top and ST(i) to specify the ith
1112 register from TOP in the stack (0 ≤ i ≤ 7). For example, if TOP contains
1113 011B (register 3 is the top of the stack), the following statement would add
1114 the contents of two registers in the stack (registers 3 and 5):
          FADD ST, ST(2)
1118 The stack organization and top-relative addressing of the numeric registers
1119 simplify subroutine programming by allowing routines to pass parameters on
1120 the register stack. By using the stack to pass parameters rather than using
1121 "dedicated" registers, calling routines gain more flexibility in how they
1122 use the stack. As long as the stack is not full, each routine simply loads
1123 the parameters onto the stack before calling a particular subroutine to
1124 perform a numeric calculation. The subroutine then addresses its parameters
1125 as ST, ST(1), etc., even though TOP may, for example, refer to physical
1126 register 3 in one invocation and physical register 5 in another.
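The top-relative addressing described above can be modeled in a few lines of
Python. This is only a sketch: the class and method names are hypothetical,
it assumes that TOP wraps around modulo 8, and it omits the tag checking and
stack-fault detection that the NPX itself performs.

```python
class RegisterStack:
    """Toy model of the eight-register NPX stack and its TOP field."""

    def __init__(self):
        self.regs = [None] * 8   # physical registers R0..R7; None means empty
        self.top = 0             # value of the TOP field

    def physical(self, i):
        """Physical register number addressed by ST(i)."""
        return (self.top + i) % 8

    def push(self, value):
        """Load: decrement TOP, then write the new top register."""
        self.top = (self.top - 1) % 8
        self.regs[self.top] = value

    def pop(self):
        """Store-and-pop: read the top register, then increment TOP."""
        value = self.regs[self.top]
        self.regs[self.top] = None
        self.top = (self.top + 1) % 8
        return value

    def st(self, i):
        """Contents of ST(i)."""
        return self.regs[self.physical(i)]
```

With TOP equal to 3, physical(0) is register 3 and physical(2) is register 5,
which matches the two registers named in the example above. A subroutine
written against st(0) and st(1) behaves the same whatever the value of TOP.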
1129 Figure 2-1. 80387 Register Set
1131              80387 DATA REGISTERS                          TAG
1134   +----+--------+------------------------------------+   +---+
1135 R0|SIGN|EXPONENT|            SIGNIFICAND             |   |   |
1136 R1|----+--------+------------------------------------|   |---|
1137 R2|----+--------+------------------------------------|   |---|
1138 R3|----+--------+------------------------------------|   |---|
1139 R4|----+--------+------------------------------------|   |---|
1140 R5|----+--------+------------------------------------|   |---|
1141 R6|----+--------+------------------------------------|   |---|
1142 R7|----+--------+------------------------------------|   |---|
1143   +----+--------+------------------------------------+   +---+
1146   +-------------------+   +-----------------------------------+
1147   | CONTROL REGISTER  |   |        INSTRUCTION POINTER        |
1148   |-------------------|   |-----------------------------------|
1149   | STATUS REGISTER   |   |           DATA POINTER            |
1150   |-------------------|   +-----------------------------------+
       |     TAG WORD      |
1152   +-------------------+
1155 2.1.2 The NPX Status Word
1157 The 16-bit status word shown in Figure 2-2 reflects the overall state of
1158 the 80387. This status word may be stored into memory using the
1159 FSTSW/FNSTSW, FSTENV/FNSTENV, and FSAVE/FNSAVE instructions, and can be
1160 transferred into the 80386 AX register with the FSTSW AX/FNSTSW AX
1161 instructions, allowing the NPX status to be inspected by the CPU.
1163 The B-bit (bit 15) is included for 8087 compatibility only. It reflects the
1164 contents of the ES bit (bit 7 of the status word), not the status of the
1165 BUSY# output of the 80387.
1167 The four NPX condition code bits (C{3}-C{0}) are similar to the flags in a
1168 CPU: the 80387 updates these bits to reflect the outcome of arithmetic
1169 operations. The effect of these instructions on the condition code bits is
1170 summarized in Table 2-1. These condition code bits are used principally for
1171 conditional branching. The FSTSW AX instruction stores the NPX status word
1172 directly into the CPU AX register, allowing these condition codes to be
1173 inspected efficiently by 80386 code. The 80386 SAHF instruction can copy
1174 C{3}-C{0} directly to 80386 flag bits to simplify conditional branching.
1175 Table 2-2 shows the mapping of these bits to the 80386 flag bits.
1177 Bits 11-13 of the status word point to the 80387 register that is the
1178 current Top of Stack (TOP). The significance of the stack top has been
1179 described in the prior section on the register stack.
1181 Figure 2-2 shows the six exception flags in bits 0-5 of the status word.
1182 Bit 7 is the exception summary status (ES) bit. ES is set if any unmasked
1183 exception bits are set, and is cleared otherwise. If this bit is set, the
1184 ERROR# signal is asserted. Bits 0-5 indicate whether the NPX has detected
1185 one of six possible exception conditions since these status bits were last
1186 cleared or reset. They are "sticky" bits, and can only be cleared by the
1187 instructions FINIT, FCLEX, FLDENV, FSAVE, and FRSTOR.
1189 Bit 6 is the stack fault (SF) bit. This bit distinguishes invalid
1190 operations due to stack overflow or underflow from other kinds of invalid
1191 operations. When SF is set, bit 9 (C{1}) distinguishes between stack
1192 overflow (C{1} = 1) and underflow (C{1} = 0).
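The field layout of the status word can be summarized as a small decoding
routine. The Python sketch below is a model for inspecting a stored status
word (for example, one written by FSTSW), not NPX code. The function names
are hypothetical, and the bit positions follow the layout of Figure 2-2, with
TOP taken from the three bits between C3 (bit 14) and C2 (bit 10).

```python
EXCEPTION_FLAGS = ("IE", "DE", "ZE", "OE", "UE", "PE")   # bits 0 through 5


def decode_status_word(sw):
    """Split a 16-bit status word into its named fields."""
    fields = {
        "B": (sw >> 15) & 1,
        "C3": (sw >> 14) & 1,
        "TOP": (sw >> 11) & 7,
        "C2": (sw >> 10) & 1,
        "C1": (sw >> 9) & 1,
        "C0": (sw >> 8) & 1,
        "ES": (sw >> 7) & 1,
        "SF": (sw >> 6) & 1,
    }
    fields["exceptions"] = [
        name for bit, name in enumerate(EXCEPTION_FLAGS) if sw & (1 << bit)
    ]
    return fields


def stack_fault_kind(sw):
    """Return 'overflow' or 'underflow' for a stack fault, else None."""
    fields = decode_status_word(sw)
    if not (fields["SF"] and "IE" in fields["exceptions"]):
        return None
    return "overflow" if fields["C1"] else "underflow"
```

For the sample value 1A41H, this yields TOP = 3 with IE, SF, and C1 set,
which the second routine reports as a stack overflow.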
1195 Figure 2-2. 80387 Status Word
1197 B (BIT 15)                = 80387 BUSY
1198 TOP (BITS 13-11)          = TOP OF STACK POINTER
1199 C3, C2, C1, C0 (BITS 14, 10, 9, 8) = CONDITION CODE
       15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
1202 +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
1203 | B | C |    TOP    | C | C | C | E | S | P | U | O | Z | D | I |
1204 |   | 3 |           | 2 | 1 | 0 | S | F | E | E | E | E | E | E |
1205 +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
1207 ES (BIT 7)                = ERROR SUMMARY STATUS
1208 SF (BIT 6)                = STACK FAULT
1209 EXCEPTION FLAGS
1210   PE (BIT 5)              = PRECISION
1211   UE (BIT 4)              = UNDERFLOW
1212   OE (BIT 3)              = OVERFLOW
1213   ZE (BIT 2)              = ZERO DIVIDE
1214   DE (BIT 1)              = DENORMALIZED OPERAND
1215   IE (BIT 0)              = INVALID OPERATION
1217 ----------------------------------------------------------------------------
1219 ES IS SET IF ANY UNMASKED EXCEPTION BIT IS SET; CLEARED OTHERWISE.
1220 SEE TABLE 2-1 FOR INTERPRETATION OF CONDITION CODE.
     TOP VALUES:
1222 000 = REGISTER 0 IS TOP OF STACK
1223 001 = REGISTER 1 IS TOP OF STACK
1227 111 = REGISTER 7 IS TOP OF STACK
1228 FOR DEFINITIONS OF EXCEPTIONS, REFER TO CHAPTER 3.
1229 ----------------------------------------------------------------------------
1232 Table 2-1. Condition Code Interpretation
1235 Instruction         C0 (S)    C3 (Z)      C1 (A)      C2 (C)
1237 FPREM, FPREM1       Three least significant bits      Reduction
1241                                           or O/U#     1=incomplete
1244 FCOMPP, FTST,       Result of comparison  Zero        Operand is not
1245 FUCOM, FUCOMP,                            or O/U#     comparable
1249 FXAM                Operand class         Sign        Operand class
1254 FDECTOP, Constant   UNDEFINED             Zero        UNDEFINED
1255 loads, FXTRACT,                           or O/U#
1262 FDIV, FDIVR, FSUB,  UNDEFINED             Roundup     UNDEFINED
1263 FSUBR, FSCALE,                            or O/U#
1268 FPTAN, FSIN,        UNDEFINED             Roundup     Reduction
1269 FCOS, FSINCOS                             or O/U#     0=complete
1270                                           undefined   1=incomplete
1273 FLDENV, FRSTOR      Each bit loaded
1278 FSTCW, FSTSW,       UNDEFINED
1283 ----------------------------------------------------------------------------
1285 O/U#        When both IE and SF bits of status word are set,
1286             indicating a stack exception, this bit distinguishes
1287             between stack overflow (C1=1) and underflow (C1=0).
1289 Reduction   If FPREM or FPREM1 produces a remainder that is less
1290             than the modulus, reduction is complete. When reduction
1291             is incomplete the value at the top of the stack is a
1292             partial remainder, which can be used as input to further
1293             reduction. For FPTAN, FSIN, FCOS, and FSINCOS, the
1294             reduction bit is set if the operand at the top of the
1295             stack is too large. In this case the original operand
1296             remains at the top of the stack.
1298 Roundup     When the PE bit of the status word is set, this bit
1299             indicates whether the last rounding in the instruction
                 was upward.
1302 UNDEFINED   Do not rely on finding any specific value in these bits.
1303 ----------------------------------------------------------------------------
1306 Table 2-2. Correspondence between 80387 and 80386 Flag Bits
1308 80387 Flag     80386 Flag
     C0             CF
     C1             (none)
     C2             PF
     C3             ZF
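The correspondence follows from where the condition code bits sit in the high
byte of the status word. The Python sketch below models the usual sequence of
FSTSW AX followed by SAHF; it is an illustration, not NPX code. The function
names are hypothetical, and the comparison encodings used in compare_outcome
(C3 C2 C0 = 000 greater, 001 less, 100 equal, 111 unordered) are assumed from
the instruction descriptions rather than taken from this section.

```python
def flags_after_sahf(status_word):
    """80386 flags that SAHF would load from AH after FSTSW AX."""
    ah = (status_word >> 8) & 0xFF
    return {
        "CF": ah & 1,          # AH bit 0 holds C0
        "PF": (ah >> 2) & 1,   # AH bit 2 holds C2
        "ZF": (ah >> 6) & 1,   # AH bit 6 holds C3
    }


def compare_outcome(status_word):
    """Interpret the flags after a compare of ST with an operand."""
    flags = flags_after_sahf(status_word)
    if flags["PF"]:
        return "unordered"
    if flags["ZF"]:
        return "ST = operand"
    if flags["CF"]:
        return "ST < operand"
    return "ST > operand"
```

C1 occupies AH bit 1, which has no corresponding 80386 flag, so it is not
available to conditional branches after SAHF.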
1318 The NPX provides the programmer with several processing options, which are
1319 selected by loading a word from memory into the control word. Figure 2-3
1320 shows the format and encoding of the fields in the control word.
1322 The low-order byte of this control word configures the 80387 exception
1323 masking. Bits 0-5 of the control word contain individual masks for each of
1324 the six exception conditions recognized by the 80387. The high-order byte of
1325 the control word configures the 80387 processing options, including
     precision control and rounding control.
1330 The precision-control bits (bits 8-9) can be used to set the 80387 internal
1331 operating precision at less than the default precision (64-bit significand).
1332 These control bits can be used to provide compatibility with the
1333 earlier-generation arithmetic processors having less precision than the
1334 80387. The precision-control bits affect the results of only the following
1335 five arithmetic instructions: ADD, SUB(R), MUL, DIV(R), and SQRT. No other
1336 operations are affected by PC.
1338 The rounding-control bits (bits 10-11) provide for the common
1339 round-to-nearest mode, as well as directed rounding and true chop. Rounding
1340 control affects only the arithmetic instructions (refer to Chapter 3 for
1341 lists of arithmetic and nonarithmetic instructions).
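The control word fields described above can be assembled with simple shifts.
The Python sketch below is a model only: the constant and function names are
hypothetical, the reserved bits are left at zero, and the decimal module is
used purely to show the direction of each rounding mode on decimal values
(the 80387 itself rounds binary significands).

```python
from decimal import Decimal, ROUND_CEILING, ROUND_DOWN, ROUND_FLOOR, ROUND_HALF_EVEN

PC_SINGLE, PC_DOUBLE, PC_EXTENDED = 0b00, 0b10, 0b11
RC_NEAREST, RC_DOWN, RC_UP, RC_CHOP = 0b00, 0b01, 0b10, 0b11
MASK_BITS = {"IM": 0, "DM": 1, "ZM": 2, "OM": 3, "UM": 4, "PM": 5}


def build_control_word(masked, pc, rc):
    """Combine exception masks (bits 0-5), PC (bits 8-9), and RC (bits 10-11)."""
    word = 0
    for name in masked:
        word |= 1 << MASK_BITS[name]
    return word | (pc << 8) | (rc << 10)


# Decimal-module rounding that moves in the same direction as each RC value.
DECIMAL_MODE = {
    RC_NEAREST: ROUND_HALF_EVEN,
    RC_DOWN: ROUND_FLOOR,
    RC_UP: ROUND_CEILING,
    RC_CHOP: ROUND_DOWN,
}


def round_to_integer(text, rc):
    """Round a decimal string to an integer in the direction selected by rc."""
    return int(Decimal(text).quantize(Decimal(1), rounding=DECIMAL_MODE[rc]))
```

For example, masking all six exceptions with extended precision and
round-to-nearest gives 033FH in this model, and -2.5 rounds to -3, -2, or -2
under round down, round up, and chop respectively.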
1344 Figure 2-3. 80387 Control Word Format
1346 BITS 15-13                = RESERVED
1347 BIT 12                    = (INFINITY CONTROL)
1348 This "infinity control" bit is not meaningful to the 80387. To maintain
1349 compatibility with the 80287, this bit can be programmed; however,
1350 regardless of its value, the 80387 treats infinity in the affine sense.
1352 RC (BITS 11-10)           = ROUNDING CONTROL
1353 PC (BITS 9-8)             = PRECISION CONTROL
       15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
1356 +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
1357 | X   X   X | X |  RC   |  PC   | X   X | P | U | O | Z | D | I |
1358 |           |   |       |       |       | M | M | M | M | M | M |
1359 +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
1361 BITS 7-6                  = RESERVED
1362 EXCEPTION MASKS
1363   PM (BIT 5)              = PRECISION
1364   UM (BIT 4)              = UNDERFLOW
1365   OM (BIT 3)              = OVERFLOW
1366   ZM (BIT 2)              = ZERO DIVIDE
1367   DM (BIT 1)              = DENORMALIZED OPERAND
1368   IM (BIT 0)              = INVALID OPERATION
1370 ----------------------------------------------------------------------------
1372 PRECISION CONTROL                  ROUNDING CONTROL
1373 00--24 BITS (SINGLE PRECISION)     00--ROUND TO NEAREST OR EVEN
1374 01--(RESERVED)                     01--ROUND DOWN (TOWARD -∞)
1375 10--53 BITS (DOUBLE PRECISION)     10--ROUND UP (TOWARD +∞)
1376 11--64 BITS (EXTENDED PRECISION)   11--CHOP (TRUNCATE TOWARD ZERO)
1377 ----------------------------------------------------------------------------
1380 2.1.4 The NPX Tag Word
1382 The tag word indicates the contents of each register in the register stack,
1383 as shown in Figure 2-4. The tag word is used by the NPX itself to
1384 distinguish between empty and nonempty register locations. Programmers of
1385 exception handlers may use this tag information to check the contents of a
1386 numeric register without performing complex decoding of the actual data in
1387 the register. The tag values from the tag word correspond to physical
1388 registers 0-7. Programmers must use the current top-of-stack (TOP) pointer
1389 stored in the NPX status word to associate these tag values with the
1390 relative stack registers ST(0) through ST(7).
1392 The exact values of the tags are generated during execution of the FSTENV
1393 and FSAVE instructions according to the actual contents of the nonempty
1394 stack locations. During execution of other instructions, the 80387 updates
1395 the TW only to indicate whether a stack location is empty or nonempty.
1398 Figure 2-4. 80387 Tag Word Format
1401 +--------+--------+--------+--------+--------+--------+--------+--------+
1402 | TAG (7)| TAG (6)| TAG (5)| TAG (4)| TAG (3)| TAG (2)| TAG (1)| TAG (0)|
1403 +--------+--------+--------+--------+--------+--------+--------+--------+
     TAG VALUES:
     00 = VALID
     01 = ZERO
1407 10 = INVALID OR INFINITY
     11 = EMPTY
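Because the tags are indexed by physical register while programs address
registers relative to TOP, an exception handler has to combine the tag word
with the TOP field of the status word. The Python sketch below shows that
lookup on stored copies of the two words. The names are hypothetical, and the
00, 01, and 11 encodings (valid, zero, empty) are assumed from the 80387 tag
definitions.

```python
TAG_NAMES = {
    0b00: "valid",
    0b01: "zero",
    0b10: "invalid or infinity",
    0b11: "empty",
}


def tag_of_st(tag_word, status_word, i):
    """Tag of the register addressed as ST(i)."""
    top = (status_word >> 11) & 7        # TOP field of the status word
    physical = (top + i) % 8             # physical register behind ST(i)
    return TAG_NAMES[(tag_word >> (2 * physical)) & 0b11]
```

With a tag word of FD3FH (register 3 valid, register 4 zero, all others
empty) and TOP equal to 3, ST(0) is reported as valid, ST(1) as zero, and
ST(2) as empty.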
1411 2.1.5 The NPX Instruction and Data Pointers
1413 The instruction and data pointers provide support for programmed
1414 exception-handlers. These registers are actually located in the 80386, but
1415 appear to be located in the 80387 because they are accessed by the ESC
1416 instructions FLDENV, FSTENV, FSAVE, and FRSTOR. Whenever the 80386 decodes
1417 an ESC instruction, it saves the instruction address, the operand address
1418 (if present), and the instruction opcode.
1420 When stored in memory, the instruction and data pointers appear in one of
1421 four formats, depending on the operating mode of the 80386 (protected mode
1422 or real-address mode) and depending on the operand-size attribute in effect
1423 (32-bit operand or 16-bit operand). When the 80386 is in virtual-8086 mode,
1424 the real-address mode formats are used.
1426 Figures 2-5 through 2-8 show these pointers as they are stored following an
1429 The FSTENV and FSAVE instructions store this data into memory, allowing
1430 exception handlers to determine the precise nature of any numeric exceptions
1431 that may be encountered.
1433 The instruction address saved in the 80386 (as in the 80287) points to any
1434 prefixes that preceded the instruction. This is different from the 8087, for
1435 which the instruction address points only to the ESC instruction opcode.
1437 Note that the processor control instructions FINIT, FLDCW, FSTCW, FSTSW,
1438 FCLEX, FSTENV, FLDENV, FSAVE, FRSTOR, and FWAIT do not affect the data
1439 pointer. Note also that, except for the instructions just mentioned, the
1440 value of the data pointer is undefined if the prior ESC instruction did not
1441 have a memory operand.
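An exception handler typically reads these pointers back from a stored
environment image. The Python sketch below parses the 32-bit protected-mode
layout shown in Figure 2-5. It is an illustration under stated assumptions:
the names are hypothetical, the doubleword at offset CH is assumed to hold the
instruction pointer offset, and the two opcode bytes are rebuilt on the
assumption that the five bits not stored are the ESC pattern 11011B.

```python
import struct

# Field layout assumed from Figure 2-5; "2x" skips each reserved halfword.
ENV32_PROTECTED = struct.Struct("<H2xH2xH2xIHHIH2x")


def parse_env32_protected(image):
    """Split a 28-byte environment image into named fields."""
    (control, status, tag, ip_offset, cs_selector,
     opcode_word, operand_offset, operand_selector) = ENV32_PROTECTED.unpack(image)
    opcode = opcode_word & 0x07FF        # OPCODE 10..0; the upper bits are zero
    return {
        "control_word": control,
        "status_word": status,
        "tag_word": tag,
        "ip_offset": ip_offset,
        "cs_selector": cs_selector,
        "operand_offset": operand_offset,
        "operand_selector": operand_selector,
        # Assumption: the first opcode byte always begins with 11011B.
        "opcode_bytes": bytes([0xD8 | (opcode >> 8), opcode & 0xFF]),
    }
```

The other three formats differ only in field widths and offsets, so the same
approach applies with a different struct layout.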
1444 Figure 2-5. Protected Mode 80387 Instruction and Data Pointer Image in
1445 Memory, 32-Bit Format
1447                 32-BIT PROTECTED MODE FORMAT
1450 +---------------------------+---------------------------+
1451 |         RESERVED          |       CONTROL WORD        | 0H
1452 +---------------------------+---------------------------+
1453 |         RESERVED          |        STATUS WORD        | 4H
1454 +---------------------------+---------------------------+
1455 |         RESERVED          |         TAG WORD          | 8H
1456 +---------------------------+---------------------------+
     |                       IP OFFSET                       | CH
1458 +---------------------------+---------------------------+
1459 | 0 0 0 0 0 |  OPCODE 10..0 |        CS SELECTOR        | 10H
1460 +---------------------------+---------------------------+
1461 |                  DATA OPERAND OFFSET                  | 14H
1462 +---------------------------+---------------------------+
1463 |         RESERVED          |     OPERAND SELECTOR      | 18H
1464 +---------------------------+---------------------------+
1467 Figure 2-6. Real Mode 80387 Instruction and Data Pointer Image in
1468 Memory, 32-Bit Format
1470                32-BIT REAL ADDRESS MODE FORMAT
1473 +---------------------------+---------------------------+
1474 |         RESERVED          |       CONTROL WORD        | 0H
1475 +---------------------------+---------------------------+
1476 |         RESERVED          |        STATUS WORD        | 4H
1477 +---------------------------+---------------------------+
1478 |         RESERVED          |         TAG WORD          | 8H
1479 +---------------------------+---------------------------+
1480 |         RESERVED          | INSTRUCTION POINTER 15..0 | CH
1481 +---------------------------+---------------------------+
1482 | 0 0 0 0 | INSTRUCTION POINTER 31..16 |0| OPCODE 10..0 | 10H
1483 +---------------------------+---------------------------+
1484 |         RESERVED          |      OPERAND POINTER      | 14H
1485 +---------------------------+---------------------------+
1486 | 0 0 0 0|OPERAND POINTER 31..16|0 0 0 0 0 0 0 0 0 0 0 0| 18H
1487 +---------------------------+---------------------------+
1490 Figure 2-7. Protected Mode 80387 Instruction and Data Pointer Image in
1491 Memory, 16-Bit Format
1493     16-BIT PROTECTED MODE FORMAT
1496 +---------------------------------+
     |          CONTROL WORD           | 0H
1498 +---------------------------------+
     |           STATUS WORD           | 2H
1500 +---------------------------------+
     |            TAG WORD             | 4H
1502 +---------------------------------+
     |            IP OFFSET            | 6H
1504 +---------------------------------+
     |           CS SELECTOR           | 8H
1506 +---------------------------------+
1507 |         OPERAND OFFSET          | AH
1508 +---------------------------------+
1509 |        OPERAND SELECTOR         | CH
1510 +---------------------------------+
1513 Figure 2-8. Real Mode 80387 Instruction and Data Pointer Image in
1514 Memory, 16-Bit Format
1516       16-BIT REAL-ADDRESS MODE
1517     AND VIRTUAL-8086 MODE FORMAT
1520 +---------------------------------+
     |          CONTROL WORD           | 0H
1522 +---------------------------------+
     |           STATUS WORD           | 2H
1524 +---------------------------------+
     |            TAG WORD             | 4H
1526 +---------------------------------+
1527 |    INSTRUCTION POINTER 15..0    | 6H
1528 +---------------------------------+
1529 |IP 19..16|0|    OPCODE 10..0     | 8H
1530 +---------------------------------+
1531 |      OPERAND POINTER 15..0      | AH
1532 +---------------------------------+
1533 |OP 19..16|0|0 0 0 0 0 0 0 0 0 0 0| CH
1534 +---------------------------------+
1537 2.2 Computation Fundamentals
1539 This section covers 80387 programming concepts that are common to all
1540 applications. It describes the 80387's internal number system and the
1541 various types of numbers that can be employed in NPX programs. The most
1542 commonly used options for rounding and precision (selected by fields in the
1543 control word) are described, with exhaustive coverage of less frequently
1544 used facilities deferred to later sections. Exception conditions that may
1545 arise during execution of NPX instructions are also described along with the
1546 options that are available for responding to these exceptions.
1551 The system of real numbers that people use for pencil and paper
1552 calculations is conceptually infinite and continuous. There is no upper or
1553 lower limit to the magnitude of the numbers one can employ in a calculation,
1554 or to the precision (number of significant digits) that the numbers can
1555 represent. When considering any real number, there are always arbitrarily
1556 many numbers both larger and smaller. There are also arbitrarily many
1557 numbers between (i.e., with more significant digits than) any two real
1558 numbers. For example, between 2.5 and 2.6 are 2.51, 2.5897, 2.500001, etc.
1560 While ideally it would be desirable for a computer to be able to operate on
1561 the entire real number system, in practice this is not possible. Computers,
1562 no matter how large, ultimately have fixed-size registers and memories that
1563 limit the system of numbers that can be accommodated. These limitations
1564 determine both the range and the precision of numbers. The result is a set
1565 of numbers that is finite and discrete, rather than infinite and
1566 continuous. This sequence is a subset of the real numbers that is designed
1567 to form a useful approximation of the real number system.
1569 Figure 2-9 superimposes the basic 80387 real number system on a real number
1570 line (decimal numbers are shown for clarity, although the 80387 actually
1571 represents numbers in binary). The dots indicate the subset of real numbers
1572 the 80387 can represent as data and final results of calculations. The
1573 80387's range of double-precision, normalized numbers is approximately
1574 ±2.23 * 10^(-308) to ±1.80 * 10^(308). Applications that are required to
1575 deal with data and final results outside this range are rare. For reference,
1576 the range of the IBM System 370* is about ±0.54 * 10^(-78) to
1579 The finite spacing in Figure 2-9 illustrates that the NPX can represent a
1580 great many, but not all, of the real numbers in its range. There is always a
1581 gap between two adjacent 80387 numbers, and it is possible for the result of
1582 a calculation to fall in this space. When this occurs, the NPX rounds the
1583 true result to a number that it can represent. Thus, a real number that
1584 requires more digits than the 80387 can accommodate (e.g., a 20-digit
1585 number) is represented with some loss of accuracy. Notice also that the
1586 80387's representable numbers are not distributed evenly along the real
1587 number line. In fact, an equal number of representable numbers exists
1588 between successive powers of 2 (i.e., as many representable numbers exist
1589 between 2 and 4 as between 65,536 and 131,072). Therefore, the gaps between
1590 representable numbers are larger as the numbers increase in magnitude. All
integers in the range ±2^(64) (approximately ±1.8 * 10^(19)), however, are
exactly representable.
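The uneven spacing described above can be observed on any host whose floating-point type follows the same double real layout. The following Python sketch is an illustration only: it assumes the host `float` is a 64-bit IEEE double, and the helper names `bits_of`, `from_bits`, and `gap_above` are hypothetical, not part of any 80387 tool.

```python
import struct

def bits_of(x: float) -> int:
    """Return the 64-bit pattern of a double as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def from_bits(b: int) -> float:
    """Inverse of bits_of: rebuild the double from its bit pattern."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]

def gap_above(x: float) -> float:
    """Distance from positive x to the next larger representable double."""
    return from_bits(bits_of(x) + 1) - x

# Adjacent positive doubles have consecutive bit patterns, so subtracting
# two patterns counts the representable numbers between the two values.
count_2_to_4 = bits_of(4.0) - bits_of(2.0)
count_64k_to_128k = bits_of(131072.0) - bits_of(65536.0)

print(count_2_to_4 == count_64k_to_128k)   # same count in both intervals
print(gap_above(2.0), gap_above(65536.0))  # the gap grows with magnitude
```

Both intervals contain the same number of representable values, so the gap between neighbors must grow in proportion to the magnitude.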
1594 In its internal operations, the 80387 actually employs a number system that
1595 is a substantial superset of that shown in Figure 2-9. The internal format
1596 (called extended real) extends the 80387's range to about ±3.30 * 10^(-4932)
1597 to ±1.21 * 10^(4932), and its precision to about 19 (equivalent decimal)
1598 digits. This format is designed to provide extra range and precision for
constants and intermediate results, and is not normally intended for data
or final results.
1602 From a practical standpoint, the 80387's set of real numbers is
1603 sufficiently large and dense so as not to limit the vast majority of
1604 microprocessor applications. Compared to most computers, including
1605 mainframes, the NPX provides a very good approximation of the real number
system. It is important to remember, however, that it is not an exact
representation, and that arithmetic on real numbers is inherently
approximate.
1610 Conversely, and equally important, the 80387 does perform exact arithmetic
1611 on integer operands. That is, if an operation on two integers is valid and
1612 produces a result that is in range, the result is exact. For example, 4 ÷ 2
1613 yields an exact integer, 1 ÷ 3 does not, and 2^(40) * 2^(30) + 1 does not,
1614 because the result requires greater than 64 bits of precision.
Figure 2-9. 80387 Double-Precision Number System

[Figure: a real number line marked with the values the 80387 can represent.
The negative normalized range ends at -2.23 X 10^(-308); the positive
normalized range runs from 2.23 X 10^(-308) to 1.80 X 10^(308). An inset
magnifies the line between 1 and 5 and carries the labels
1.99999999999999999, 2.00000000000000000, (NOT REPRESENTABLE), and
PRECISION: 18 DIGITS.]
1642 2.2.2 Data Types and Formats
1644 The 80387 recognizes seven numeric data types for memory-based values,
1645 divided into three classes: binary integers, packed decimal integers, and
1646 binary reals. A later section describes how these formats are stored in
1647 memory (the sign is always located in the highest-addressed byte).
Figure 2-10 summarizes the format of each data type. In the figure, the
most significant digits of all numbers (and fields within numbers) are the
leftmost digits.
Figure 2-10. 80387 Data Formats

                                               LAYOUT (MOST SIGNIFICANT,
DATA FORMAT          RANGE        PRECISION    HIGHEST-ADDRESSED BYTE FIRST)
-------------------  -----------  -----------  -------------------------------
WORD INTEGER         10^(4)       16 BITS      16-BIT TWO'S COMPLEMENT VALUE
SHORT INTEGER        10^(9)       32 BITS      32-BIT TWO'S COMPLEMENT VALUE
LONG INTEGER         10^(19)      64 BITS      64-BIT TWO'S COMPLEMENT VALUE
PACKED BCD           10^(18)      18 DIGITS    S | X | d{17} d{16} ... d{2} d{1} d{0}
SINGLE PRECISION     10^(±38)     24 BITS      S | BE | SIGNIFICAND
DOUBLE PRECISION     10^(±308)    53 BITS      S | BE | SIGNIFICAND
EXTENDED PRECISION   10^(±4932)   64 BITS      S (BIT 79) | BE (BITS 78-64) |
                                               I (BIT 63) . SIGNIFICAND (BITS 62-0)

NOTES:
  (1) BE = BIASED EXPONENT
  (2) S = SIGN BIT (0 = positive, 1 = negative)
  (3) d{n} = DECIMAL DIGIT (TWO PER BYTE)
  (4) X = BITS HAVE NO SIGNIFICANCE; 80387 IGNORES WHEN LOADING,
      ZEROS WHEN STORING
  (5) . = POSITION OF IMPLICIT BINARY POINT
  (6) I = INTEGER BIT OF SIGNIFICAND; STORED IN EXTENDED REAL,
      IMPLICIT IN SINGLE AND DOUBLE PRECISION
  (7) EXPONENT BIAS (NORMALIZED VALUES):
        SINGLE: 127 (7FH)
        DOUBLE: 1023 (3FFH)
        EXTENDED REAL: 16383 (3FFFH)
  (8) PACKED BCD: (-1)^(S) (D{17}...D{0})
  (9) REAL: (-1)^(S) (2^(E-BIAS)) (F{0}.F{1}...)
1715 2.2.2.1 Binary Integers
1717 The three binary integer formats are identical except for length, which
1718 governs the range that can be accommodated in each format. The leftmost bit
1719 is interpreted as the number's sign: 0 = positive and 1 = negative. Negative
1720 numbers are represented in standard two's complement notation (the binary
1721 integers are the only 80387 format to use two's complement). The quantity
1722 zero is represented with a positive sign (all bits are 0). The 80387 word
1723 integer format is identical to the 16-bit signed integer data type of the
1724 80386; the 80387 short integer format is identical to the 32-bit signed
1725 integer data type of the 80386.
1727 The binary integer formats exist in memory only. When used by the 80387,
1728 they are automatically converted to the 80-bit extended real format. All
1729 binary integers are exactly representable in the extended real format.
1732 2.2.2.2 Decimal Integers
1734 Decimal integers are stored in packed decimal notation, with two decimal
1735 digits "packed" into each byte, except the leftmost byte, which carries the
1736 sign bit (0 = positive, 1 = negative). Negative numbers are not stored in
1737 two's complement form and are distinguished from positive numbers only by
1738 the sign bit. The most significant digit of the number is the leftmost
1739 digit. All digits must be in the range 0-9.
1741 The decimal integer format exists in memory only. When used by the 80387,
1742 it is automatically converted to the 80-bit extended real format. All
1743 decimal integers are exactly representable in the extended real format.
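The packing rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the 80387's own conversion logic: the function names `encode_packed_bcd` and `decode_packed_bcd` are hypothetical, and the byte order (low-order digit pair at the lowest address, sign byte at the highest address) is assumed from the statement that the sign is always located in the highest-addressed byte.

```python
def encode_packed_bcd(value: int) -> bytes:
    """Encode an integer of up to 18 decimal digits as 10 packed-BCD bytes.

    Byte 0 (lowest address) holds digits d1 d0, byte 8 holds d17 d16, and
    byte 9 carries the sign in bit 7 (the unused bits are written as zeros).
    """
    if abs(value) >= 10**18:
        raise ValueError("packed BCD holds at most 18 decimal digits")
    digits = f"{abs(value):018d}"              # d17 ... d0, most significant first
    out = bytearray()
    for i in range(16, -2, -2):                # digit pairs from d1 d0 up to d17 d16
        high, low = int(digits[i]), int(digits[i + 1])
        out.append((high << 4) | low)          # two digits per byte
    out.append(0x80 if value < 0 else 0x00)    # sign byte: only the sign bit differs
    return bytes(out)

def decode_packed_bcd(raw: bytes) -> int:
    """Decode 10 packed-BCD bytes produced by encode_packed_bcd."""
    magnitude = 0
    for byte in reversed(raw[:9]):             # most significant pair first
        magnitude = magnitude * 100 + (byte >> 4) * 10 + (byte & 0x0F)
    return -magnitude if raw[9] & 0x80 else magnitude

print(encode_packed_bcd(-1234).hex())
print(decode_packed_bcd(encode_packed_bcd(-1234)))
```

Note that the magnitude bytes of 1234 and -1234 are identical; only the sign bit differs, in contrast with the two's complement binary integers.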
1746 2.2.2.3 Real Numbers
The 80387 represents real numbers of the form:

   (-1)^(s) 2^(E) (b{0}.b{1}b{2}b{3}..b{p-1})

where:

   s    = 0 or 1
   E    = any integer between Emin and Emax, inclusive
   b{i} = 0 or 1
   p    = number of bits of precision

Table 2-3 summarizes the parameters for each of the three real-number
formats.
The 80387 stores real numbers in a three-field binary format that resembles
scientific, or exponential, notation. The format consists of the following
fields:
  • The number's significant digits are held in the significand field,
    b{0}.b{1}b{2}b{3}..b{p-1}. (The term "significand" is analogous
    to the term "mantissa" used to describe floating point numbers on some
    computers.)
  • The exponent field, e = E+bias, locates the binary point within the
    significant digits (and therefore determines the number's magnitude).
    (The term "exponent" is analogous to the term "characteristic" used to
    describe floating point numbers on some computers.)
  • The 1-bit sign field indicates whether the number is positive or
    negative. Negative numbers differ from positive numbers only in the
    sign bits of their significands.
1780 Table 2-4 shows how the real number 178.125 (decimal) is stored in the
1781 80387 single real format. The table lists a progression of equivalent
1782 notations that express the same value to show how a number can be converted
1783 from one form to another. (The ASM386 and PL/M-386 language translators
1784 perform a similar process when they encounter programmer-defined real number
1785 constants.) Note that not every decimal fraction has an exact binary
1786 equivalent. The decimal number 1/10, for example, cannot be expressed
1787 exactly in binary (just as the number 1/3 cannot be expressed exactly in
1788 decimal). When a translator encounters such a value, it produces a rounded
1789 binary approximation of the decimal value.
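The conversion of 178.125 to the single real format can be reproduced on a host that stores single-precision values in the same layout. The sketch below is illustrative only; it assumes Python's `struct` format `f` produces the IEEE single format, and the helper name `single_fields` is hypothetical.

```python
import struct

def single_fields(x: float):
    """Split x, rounded to single format, into its three fields as bit strings."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    bits = f"{word:032b}"
    return bits[0], bits[1:9], bits[9:]        # sign, biased exponent, fraction

sign, biased_exponent, fraction = single_fields(178.125)
print(sign, biased_exponent, fraction)
# 0 10000110 01100100010000000000000

# Subtracting the single-format bias (127) recovers the true exponent.
print(int(biased_exponent, 2) - 127)           # 7, since 178.125 = 1.0110010001 * 2^(7)
```

The 23 stored fraction bits omit the integer bit, which is implicit in the single format.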
The NPX usually carries the digits of the significand in normalized form.
This means that, except for the value zero, the significand contains an
integer bit and fraction bits as follows:

   1.fff...ff

where . indicates an assumed binary point. The number of fraction bits
varies according to the real format: 23 for single, 52 for double, and 63
for extended real. By normalizing real numbers so that their integer bit is
always a 1, the 80387 eliminates leading zeros in small values (|X| < 1).
This technique maximizes the number of significant digits that can be
accommodated in a significand of a given width. Note that, in the single
and double formats, the integer bit is implicit and is not actually stored;
the integer bit is physically present in the extended format only.
1806 If one were to examine only the significand with its assumed binary point,
1807 all normalized real numbers would have values greater than or equal to 1 and
1808 less than 2. The exponent field locates the actual binary point in the
1809 significant digits. Just as in decimal scientific notation, a positive
1810 exponent has the effect of moving the binary point to the right, and a
1811 negative exponent effectively moves the binary point to the left, inserting
1812 leading zeros as necessary. An unbiased exponent of zero indicates that the
1813 position of the assumed binary point is also the position of the actual
binary point. The exponent field, then, determines a real number's
magnitude.
1817 In order to simplify comparing real numbers (e.g., for sorting), the 80387
1818 stores exponents in a biased form. This means that a constant is added to
1819 the true exponent described above. As Table 2-3 shows, the value of this
1820 bias is different for each real format. It has been chosen so as to
1821 force the biased exponent to be a positive value. This allows two real
1822 numbers (of the same format and sign) to be compared as if they are unsigned
1823 binary integers. That is, when comparing them bitwise from left to right
1824 (beginning with the leftmost exponent bit), the first bit position that
1825 differs orders the numbers; there is no need to proceed further with the
1826 comparison. A number's true exponent can be determined simply by
1827 subtracting the bias value of its format.
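The ordering property of biased exponents can be checked directly. The following sketch assumes the host packs single-precision values in the same sign, biased exponent, significand order; `single_bits` is a hypothetical helper, and the test values are arbitrary positive numbers chosen for illustration.

```python
import struct

def single_bits(x: float) -> int:
    """Return the 32-bit single-format pattern of x as an unsigned integer."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return word

# Positive reals of the same format: sorting by raw bit pattern gives the
# same order as sorting by numeric value.
values = [0.75, 178.125, 1.5, 0.001953125, 65536.0]
by_pattern = sorted(values, key=single_bits)
print(by_pattern == sorted(values))            # True

for v in by_pattern:
    print(f"{single_bits(v):032b}  {v}")
```

Without the bias, small magnitudes would carry negative exponents whose bit patterns compare as large unsigned values, and this shortcut would fail.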
1829 The single and double real formats exist in memory only. If a number in one
1830 of these formats is loaded into an 80387 register, it is automatically
1831 converted to extended format, the format used for all internal operations.
1832 Likewise, data in registers can be converted to single or double real for
1833 storage in memory. The extended real format may be used in memory also,
1834 typically to store intermediate results that cannot be held in registers.
1836 Most applications should use the double format to store real-number data
1837 and results; it provides sufficient range and precision to return correct
1838 results with a minimum of programmer attention. The single real format is
1839 appropriate for applications that are constrained by memory, but it should
1840 be recognized that this format provides a smaller margin of safety. It is
1841 also useful for the debugging of algorithms, because roundoff problems will
1842 manifest themselves more quickly in this format. The extended real format
1843 should normally be reserved for holding intermediate results, loop
1844 accumulations, and constants. Its extra length is designed to shield final
1845 results from the effects of rounding and overflow/underflow in intermediate
1846 calculations. However, the range and precision of the double format are
1847 adequate for most microcomputer applications.
Table 2-3. Summary of Format Parameters

                             ----------- Format -----------
Parameter                    Single     Double     Extended

Format width in bits         32         64         80
p (bits of precision)        24         53         64
Exponent width in bits       8          11         15
Emax                         +127       +1023      +16383
Emin                         -126       -1022      -16382
Exponent bias                +127       +1023      +16383
Table 2-4. Real Number Notation

Notation                               Value

Ordinary Decimal                       178.125
Scientific Decimal                     1.78125E2
Scientific Binary                      1.0110010001E111
Scientific Binary (Biased Exponent)    1.0110010001E10000110

80387 Single Format     Sign    Biased Exponent    Significand
(Normalized)            0       10000110           01100100010000000000000
1877 2.2.3 Rounding Control
1879 Internally, the 80387 employs three extra bits (guard, round, and sticky
1880 bits) that enable it to round numbers in accord with the infinitely precise
1881 true result of a computation; these bits are not accessible to programmers.
1882 Whenever the destination can represent the infinitely precise true result,
1883 the 80387 delivers it. Rounding occurs in arithmetic and store operations
1884 when the format of the destination cannot exactly represent the infinitely
1885 precise true result. For example, a real number may be rounded if it is
1886 stored in a shorter real format, or in an integer format. Or, the infinitely
1887 precise true result may be rounded when it is returned to a register.
1889 The NPX has four rounding modes, selectable by the RC field in the control
1890 word (see Figure 2-3). Given a true result b that cannot be represented by
1891 the target data type, the 80387 determines the two representable numbers a
1892 and c that most closely bracket b in value (a < b < c). The processor then
1893 rounds (changes) b to a or to c according to the mode selected by the RC
1894 field as shown in Table 2-5. Rounding introduces an error in a result that
1895 is less than one unit in the last place to which the result is rounded.
  • "Round to nearest" is the default mode and is suitable for most
    applications; it provides the most accurate and statistically unbiased
    estimate of the true result.
  • The "chop" or "round toward zero" mode is provided for integer
    arithmetic applications.
  • "Round up" and "round down" are termed directed rounding and can be
    used to implement interval arithmetic. Interval arithmetic generates a
    certifiable result independent of the occurrence of rounding and other
    errors. The upper and lower bounds of an interval may be computed by
    executing an algorithm twice, rounding up in one pass and down in the
    other.
1911 Rounding control affects only the arithmetic instructions (refer to Chapter
1912 3 for lists of arithmetic and nonarithmetic instructions).
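The four rounding actions can be illustrated with Python's `decimal` module, which offers equivalent rounding rules. This is an analogy, not a model of the 80387 hardware: the mapping of RC field values to `decimal` constants follows Table 2-5, the helper name `round_to_integer` is hypothetical, and the example rounds to an integer destination.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_FLOOR, ROUND_CEILING, ROUND_DOWN

# RC field value -> equivalent decimal-module rounding rule
RC_MODES = {
    "00": ROUND_HALF_EVEN,   # round to nearest, ties to even
    "01": ROUND_FLOOR,       # round down (toward minus infinity)
    "10": ROUND_CEILING,     # round up (toward plus infinity)
    "11": ROUND_DOWN,        # chop (toward zero)
}

def round_to_integer(b: str, rc: str) -> int:
    """Round the decimal string b to an integer under the given RC setting."""
    return int(Decimal(b).quantize(Decimal("1"), rounding=RC_MODES[rc]))

for rc in RC_MODES:
    print(rc, round_to_integer("2.5", rc), round_to_integer("-2.5", rc))
# 00 2 -2
# 01 2 -3
# 10 3 -2
# 11 2 -2
```

The value 2.5 lies exactly halfway between a = 2 and c = 3, so the round-to-nearest mode selects the even neighbor; the two directed modes bracket the true result from below and above, which is the basis of interval arithmetic.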
1915 2.2.4 Precision Control
1917 The 80387 allows results to be calculated with either 64, 53, or 24 bits of
1918 precision in the significand as selected by the precision control (PC) field
1919 of the control word. The default setting, and the one that is best suited
1920 for most applications, is the full 64 bits of significance provided by the
1921 extended real format. The other settings are required by the IEEE standard
1922 and are provided to obtain compatibility with the specifications of certain
1923 existing programming languages. Specifying less precision nullifies the
1924 advantages of the extended format's extended fraction length. When reduced
1925 precision is specified, the rounding of the fractional value clears the
1926 unused bits on the right to zeros.
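The effect of the three precision settings can be sketched with exact rational arithmetic. The code below is an illustration, not the 80387's rounding circuitry: `round_to_precision` is a hypothetical helper that rounds a positive value to p significant bits in round-to-nearest mode, and the final check assumes the host `float` is an IEEE double.

```python
from fractions import Fraction

def round_to_precision(x: Fraction, p: int) -> Fraction:
    """Round positive x to p significant bits (round to nearest, ties to even)."""
    # Find e such that 2^e <= x < 2^(e+1).
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    unit = Fraction(2) ** (e - p + 1)          # weight of the last retained bit
    return round(x / unit) * unit              # round() on a Fraction rounds ties to even

third = Fraction(1, 3)
for p in (24, 53, 64):
    result = round_to_precision(third, p)
    print(p, float(abs(third - result)))       # rounding error shrinks as p grows

# With p = 53 the result matches the host's double-precision quotient.
print(round_to_precision(third, 53) == Fraction(1.0 / 3.0))
```

Every bit to the right of the last retained position is zero in the rounded result, which is why selecting 53 or 24 bits gives up the extra fraction length of the extended format.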
Table 2-5. Rounding Modes

RC Field    Rounding Mode             Rounding Action

00          Round to nearest          Closer to b of a or c; if equally
                                      close, select even number (the one
                                      whose least significant bit is zero).
01          Round down (toward -∞)    a
10          Round up (toward +∞)      c
11          Chop (toward 0)           Smaller in magnitude of a or c.

NOTE: a < b < c; a and c are successive representable numbers; b is not
representable.
Chapter 3  Special Computational Situations
---------------------------------------------------------------------------
1951 Besides being able to represent positive and negative numbers, the 80387
1952 data formats may be used to describe other entities. These special values
1953 provide extra flexibility, but most users will not need to understand them
1954 in order to use the 80387 successfully. This section describes the special
1955 values that may occur in certain cases and the significance of each. The
1956 80387 exceptions are also described, for writers of exception handlers and
1957 for those interested in probing the limits of computation using the 80387.
1959 The material presented in this section is mainly of interest to programmers
1960 concerned with writing exception handlers. Many readers will only need to
1963 When discussing these special computational situations, it is useful to
1964 distinguish between arithmetic instructions and nonarithmetic instructions.
1965 Nonarithmetic instructions are those that have no operands or transfer their
1966 operands without substantial change; arithmetic instructions are those that
1967 make significant changes to their operands. Table 3-1 defines these two
1968 classes of instructions.
1971 Table 3-1. Arithmetic and Nonarithmetic Instructions
1974 Nonarithmetic Instructions Arithmetic Instructions
1983 FLD (register-to-register) FIADD
1984 FLD (extended format from memory) FICOM(P)
1985 FLD constant FIDIV(R)
1990 FSAVE FLD (conversion)
1991 FST(P) (register-to-register) FMUL(P)
1992 FSTP (extended format to memory) FPATAN
2011 3.1 Special Numeric Values
2013 The 80387 data formats encompass encodings for a variety of special values
2014 in addition to the typical real or integer data values that result from
2015 normal calculations. These special values have significance and can express
2016 relevant information about the computations or operations that produced
2017 them. The various types of special values are
  • Denormal real numbers
  • Positive and negative infinity
  • NaN (Not-a-Number)
  • Unsupported formats
2026 The following sections explain the origins and significance of each of
2027 these special values. Tables 3-6 through 3-9 at the end of this section
show how each of these special values is encoded for each of the numeric
data types.
2032 3.1.1 Denormal Real Numbers
2034 The 80387 generally stores nonzero real numbers in normalized
2035 floating-point form; that is, the integer (leading) bit of the significand
2036 is always a one. (Refer to Chapter 2 for a review of operand formats.) This
2037 bit is explicitly stored in the extended format, and is implicitly assumed
to be a one (1.) in the single and double formats. Since leading zeros are
2039 eliminated, normalized storage allows the maximum number of significant
2040 digits to be held in a significand of a given width.
When a numeric value becomes very close to zero, normalized floating-point
storage cannot be used to express the value accurately. The term tiny is
used here to precisely define what values require special handling by the
80387. A number R is said to be tiny when -2^(Emin) < R < 0 or
0 < R < +2^(Emin). (As defined in Chapter 2, Emin is -126 for single format,
-1022 for double format, and -16382 for extended format.) In other words, a
nonzero number is tiny if its exponent would be too negative to store in the
destination format.
2051 To accommodate these instances, the 80387 can store and operate on reals
2052 that are not normalized, i.e., whose significands contain one or more
2053 leading zeros. Denormals typically arise when the result of a calculation
2054 yields a value that is tiny.
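The definition of tiny translates directly into a test against 2^(Emin). The sketch below uses exact rational arithmetic so that the check does not depend on the host's own floating-point range; the name `is_tiny` and the `EMIN` table are hypothetical, with the Emin values taken from the definition above.

```python
from fractions import Fraction

# Emin for each real format
EMIN = {"single": -126, "double": -1022, "extended": -16382}

def is_tiny(r: Fraction, fmt: str) -> bool:
    """True when r is nonzero and its magnitude is below 2^(Emin) for fmt."""
    return r != 0 and abs(r) < Fraction(2) ** EMIN[fmt]

# An arbitrary example value: 1.359375 * 2^(-129)
r = Fraction(87, 64) * Fraction(2) ** -129
print(is_tiny(r, "single"))     # True: too small for a normalized single
print(is_tiny(r, "double"))     # False: well within the double range
print(is_tiny(Fraction(0), "single"))   # False: zero is not tiny
```

The same value can therefore be tiny for one destination format and perfectly ordinary for another.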
2056 Denormal values have the following properties:
  • The biased floating-point exponent is stored at its smallest value
    (zero).
  • The integer bit of the significand (whether explicit or implicit) is
    zero.
2064 The leading zeros of denormals permit smaller numbers to be represented, at
2065 the possible cost of some lost precision (the number of significant bits is
2066 reduced by the leading zeros). In typical algorithms, extremely small values
2067 are most likely to be generated as intermediate, rather than final, results.
2068 By using the NPX's extended real format for holding intermediate values,
quantities as small as ±3.4 * 10^(-4932) can be represented; this makes the
2070 occurrence of denormal numbers a rare phenomenon in 80387 applications.
2071 Nevertheless, the NPX can load, store, and operate on denormalized real
2072 numbers when they do occur.
2074 Denormals receive special treatment by the 80387 in three respects:
  • The 80387 avoids creating denormals whenever possible. In other words,
    it always normalizes real numbers except in the case of tiny numbers.
  • The 80387 provides the unmasked underflow exception to permit
    programmers to detect cases when denormals would be created.
  • The 80387 provides the denormal exception to permit programmers to
    detect cases when denormals enter into further calculations.
2085 Denormalizing means incrementing the true result's exponent and inserting a
2086 corresponding leading zero in the significand, shifting the rest of the
2087 significand one place to the right. Denormal values may occur in any of the
2088 single, double, or extended formats. Table 3-2 illustrates how a result
2089 might be denormalized to fit a single format destination.
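The shift-and-increment procedure can be sketched as a short loop. This is a simplified illustration: the function name `denormalize` is hypothetical, the significand is held as a 24-bit integer whose top bit is the integer bit, and bits shifted off the right are simply dropped, whereas the 80387 rounds them according to the RC field. The starting value is the true result used in Table 3-2.

```python
def denormalize(exponent: int, significand: int, emin: int = -126, width: int = 24):
    """Raise the exponent to emin, shifting the significand right one place per step."""
    steps = []
    while exponent < emin:
        exponent += 1                  # increment the true result's exponent
        significand >>= 1              # insert a leading zero, shift the rest right
        steps.append((exponent, f"{significand:0{width}b}"))
    return exponent, significand, steps

# True result: 1.01011100..00 with exponent -129 (single format destination)
true_significand = 0b1010111 << 17     # 24 bits, integer bit set
exponent, significand, steps = denormalize(-129, true_significand)

for exp, bits in steps:
    print(exp, bits[0] + "." + bits[1:])
# -128 0.10101110000000000000000
# -127 0.01010111000000000000000
# -126 0.00101011100000000000000
```

In this case no one-bits are shifted out, so the denormal result carries the same value as the true result, with three fewer bits of headroom in the significand.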
2091 Denormalization produces either a denormal or a zero. Denormals are readily
2092 identified by their exponents, which are always the minimum for their
2093 formats; in biased form, this is always the bit string: 00..00. This same
2094 exponent value is also assigned to the zeros, but a denormal has a nonzero
2095 significand. A denormal in a register is tagged special. Tables 3-8 and
2096 3-9 show how denormal values are encoded in each of the real data formats.
The denormalization process causes loss of significance if low-order
one-bits are shifted off the right of the significand. In a severe
2100 case, all the significand bits of the true result are shifted out and
2101 replaced by the leading zeros. In this case, the result of denormalization
2102 is a true zero, and, if the value is in a register, it is tagged as a zero.
2104 Denormals are rarely encountered in most applications. Typical debugged
2105 algorithms generate extremely small results during the evaluation of
2106 intermediate subexpressions; the final result is usually of an appropriate
2107 magnitude for its single or double format real destination. If intermediate
2108 results are held in temporary real, as is recommended, the great range of
2109 this format makes underflow very unlikely. Denormals are likely to arise
2110 only when an application generates a great many intermediates, so many that
2111 they cannot be held on the register stack or in extended format memory
2112 variables. If storage limitations force the use of single or double format
2113 reals for intermediates, and small values are produced, underflow may occur,
2114 and, if masked, may generate denormals.
When a denormal number in single or double format is used as a source
2117 operand and the denormal exception is masked, the 80387 automatically
2118 normalizes the number when it is converted to extended format.
Table 3-2. Denormalization Process

Operation          Sign    Exponent    Significand

True Result        0       -129        1.01011100..00
Denormalize        0       -128        0.101011100..00
Denormalize        0       -127        0.0101011100..00
Denormalize        0       -126        0.00101011100..00
Denormal Result    0       -126        0.00101011100..00
2132 3.1.1.1 Denormals and Gradual Underflow
2134 Floating-point arithmetic cannot carry out all operations exactly for all
2135 operands; approximation is unavoidable when the exact result is not
2136 representable as a floating-point variable. To keep the approximation
2137 mathematically tractable, the hardware is made to conform to accuracy
2138 standards that can be modeled by certain inequalities instead of equations.
Let

   X ← Y @ Z    (where @ is some operation)

represent a typical operation. In the default rounding mode (round to
nearest), each operation is carried out with an absolute error no larger
than half the separation between the two floating-point numbers closest to
the exact result. Let x be the value stored for the variable whose name in
2147 the program is X, and similarly y for Y, and z for Z. Normally y and z will
2148 differ by accumulated errors from what is desired and from what would have
2149 been obtained in the absence of error. For the calculation of x we assume
2150 that y and z are the best approximations available, and we seek to compute x
2151 as well as we can. If y@z is representable exactly, then we expect x = y@z,
2152 and that is what we get for every algebraic operation on the 80387 (i.e.,
2153 when y@z is one of y+z, y-z, y*z, y÷z, sqrt z). But if y@z must be
2154 approximated, as is usually the case, then x must differ from y@z by no
2155 more than half the difference between the two representable numbers that
2156 straddle y@z. That difference depends on two factors:
2158 1. The precision to which the calculation is carried out, as determined
2159 either by the precision control bits or by the format used in memory.
2160 On the 80387, the precisions are single (24 significant bits), double
2161 (53 significant bits), and extended (64 significant bits).
2163 2. How close y@z is to zero. In this respect the presence of denormal
2164 numbers on the 80387 provides a distinct advantage over systems that
2165 do not admit denormal numbers.
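The half-separation bound can be checked numerically for the division case. The sketch below is an illustration under stated assumptions: the host `float` is taken to be an IEEE double operating in round-to-nearest mode, exact rationals stand in for the infinitely precise result, and the helper names `step` and `check_half_spacing` are hypothetical.

```python
import struct
from fractions import Fraction

def step(x: float, direction: int) -> float:
    """Return the representable double adjacent to positive x (+1 up, -1 down)."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    return struct.unpack("<d", struct.pack("<Q", bits + direction))[0]

def check_half_spacing(y: float, z: float) -> bool:
    """Check x = y / z (positive operands) against the half-separation bound."""
    x = y / z                                   # rounded by the host
    exact = Fraction(y) / Fraction(z)           # infinitely precise y@z
    if Fraction(x) == exact:
        return True                             # representable: delivered exactly
    # Find the two representable numbers that straddle the exact result.
    lo, hi = (x, step(x, +1)) if Fraction(x) < exact else (step(x, -1), x)
    assert Fraction(lo) < exact < Fraction(hi)
    return abs(Fraction(x) - exact) <= (Fraction(hi) - Fraction(lo)) / 2

print(check_half_spacing(4.0, 2.0))   # exact case
print(check_half_spacing(1.0, 3.0))   # approximated case
```

The first factor in the list above fixes the separation for a given magnitude; the second determines how that separation behaves as the result approaches zero.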
2167 In any floating-point number system, the density of representable numbers
2168 is greater near zero than near the largest representable magnitudes.
2169 However, machines that do not use denormal numbers suffer from an enormous
2170 gap between zero and its closest neighbors. Figures 3-1 and 3-2 show what
2171 happens near zero in two kinds of floating-point number systems.
2173 Figure 3-1 shows a floating-point number system that (like the 80387)
2174 admits denormal numbers. For simplicity, only the non-negative numbers
2175 appear and the figure illustrates a number system that carries just four
significant bits instead of the 24, 53, or 64 significant bits that the
80387 provides.
2179 Each vertical mark stands for a number representable in four significant
2180 bits, and the bolder marks stand for the normal powers of 2. The denormal
2181 numbers lie between 0 and the nearest normal power of 2. They are no less
2182 dense than the remaining normal nonzero numbers.
2184 Figure 3-2 shows a floating-point number system that (unlike the 80387)
2185 does not admit denormal numbers. There are two yawning gaps, one on the
2186 positive side of zero (as illustrated) and one on the negative side of zero
2187 (not illustrated). The gap between zero and the nearest neighbor of zero
2188 differs from the gap between that neighbor and the next bigger number by a
2189 factor of about 8.4 * 10^(6) for single, 4.5 * 10^(15) for double, and
9.2 * 10^(18) for extended format. Those gaps would horribly complicate error
analysis.
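The quoted factors follow from the precision alone. Without denormals, the gap from zero to its nearest neighbor is 2^(Emin), while the gap from that neighbor to the next larger number is 2^(Emin-(p-1)), so the ratio is 2^(p-1) for a format with p bits of precision. A short Python check (the name `gap_ratio` is hypothetical; the values of p are those of Table 2-3):

```python
PRECISION_BITS = {"single": 24, "double": 53, "extended": 64}

def gap_ratio(p: int) -> int:
    """Ratio of the gap below the smallest normal number to the gap above it."""
    return 2 ** (p - 1)

for name, p in PRECISION_BITS.items():
    print(f"{name:9s} 2^({p - 1}) = {gap_ratio(p):.1e}")
# single    2^(23) = 8.4e+06
# double    2^(52) = 4.5e+15
# extended  2^(63) = 9.2e+18
```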
2193 The advantage of denormal numbers is apparent when one considers what
2194 happens in either case when the underflow exception is masked and y@z falls
2195 into the space between zero and the smallest normal magnitude. The 80387
2196 returns the nearest denormal number. This action might be called "gradual
2197 underflow." The effect is no different than the rounding that can occur when
2198 y@z falls in the normal range.
2200 On the other hand, the system that does not have denormal numbers returns
2201 zero as the result, an action that can be much more inaccurate than
2202 rounding. This action could be called "abrupt underflow."
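The two kinds of number system can be enumerated for a toy format with four significant bits. The sketch below is illustrative only: `toy_numbers` and `first_gaps` are hypothetical helpers, and the exponent range (Emin = 0, Emax = 2) is an arbitrary choice made to keep the listing short.

```python
from fractions import Fraction

def toy_numbers(p=4, emin=0, emax=2, denormals=True):
    """Non-negative values of a toy binary format with p significant bits."""
    values = {Fraction(0)}
    for e in range(emin, emax + 1):
        for m in range(2 ** (p - 1), 2 ** p):        # normalized: integer bit is 1
            values.add(Fraction(m, 2 ** (p - 1)) * Fraction(2) ** e)
    if denormals:
        for m in range(1, 2 ** (p - 1)):             # integer bit is 0, exponent is Emin
            values.add(Fraction(m, 2 ** (p - 1)) * Fraction(2) ** emin)
    return sorted(values)

def first_gaps(values):
    """Gap from zero to its neighbor, and from that neighbor to the next number."""
    return values[1] - values[0], values[2] - values[1]

print(first_gaps(toy_numbers(denormals=True)))    # equal gaps: gradual underflow
print(first_gaps(toy_numbers(denormals=False)))   # first gap is 2^(p-1) = 8 times larger
```

With denormals, a result that falls below the smallest normal magnitude is rounded with the same absolute error as a result just above it; without them, everything in that region collapses to zero.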
Figure 3-1. Floating-Point System with Denormals

0+++++++|+++++++|-+-+-+-+-+-+-+-|---+---+---+---+---+---+---+---|------+...
 \_____/ - - - - - - - - - - Normal Numbers - - - - - - - - - - - - - - ->
Denormals

Figure 3-2. Floating-Point System without Denormals

0       |+++++++|-+-+-+-+-+-+-+-|---+---+---+---+---+---+---+---|------+---...
        - - - - - - - - - - Normal Numbers - - - - - - - - - - - - - - ->

(In both figures, | marks a normal power of 2.)
2222 The value zero in the real and decimal integer formats may be signed either
2223 positive or negative, although the sign of a binary integer zero is always
2224 positive. For computational purposes, the value of zero always behaves
2225 identically, regardless of sign, and typically the fact that a zero may be
2226 signed is transparent to the programmer. If necessary, the FXAM instruction
2227 may be used to determine a zero's sign.
2229 If a zero is loaded or generated in a register, the register is tagged
2230 zero. Table 3-3 lists the results of instructions executed with zero
2231 operands and also shows how a zero may be created from nonzero operands.
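The behavior of signed zeros can be observed on any host that implements the same arithmetic rules. The following Python sketch is an analogy rather than 80387 code: it assumes the host `float` is an IEEE double in round-to-nearest mode, and `zero_sign` is a hypothetical helper that plays a role loosely similar to examining a zero with FXAM.

```python
import math

def zero_sign(z: float) -> str:
    """Report the sign carried by a zero (or any other value)."""
    return "-" if math.copysign(1.0, z) < 0 else "+"

pos_zero, neg_zero = 0.0, -0.0
x = 178.125

print(pos_zero == neg_zero)             # True: the sign does not affect comparison
print(zero_sign(neg_zero))              # -   : but the sign is still recorded
print(zero_sign(pos_zero + neg_zero))   # +   : round to nearest yields +0
print(zero_sign(x - x))                 # +   : +X minus +X under round to nearest
print(zero_sign(neg_zero * neg_zero))   # +   : -0 * -0
```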
2234 Table 3-3. Zero Operands and Results
2237 Key to symbols used in this table
2238 X and Y denote nonzero operand.
2239 * Sign of original zero operand.
2240 # Sign of original X operand.
2241 -# Compliment of sign of original X operand.
2242 Þ Exclusive OR of the signs of the operands.
2245 Operation Operands Result
2253 When extreme underflow denormalizes the result to zero.
2257 When extreme underflow denormalizes the result to zero.
2265 When 0 < X < 1 and rounding mode is not up.
2269 When 0 < X < 1 and rounding mode is not up.
2272 Addition +0 plus +0 +0
2274 +0 plus -0, -0 plus +0 ±0
2275 Sign determined by rounding mode: + for nearest, up, or chop, - for down.
2278 -X plus +X, +X plus -X ±0
2279 Sign determined by rounding mode: + for nearest, up, or chop, - for down.
2282 ±0 plus ±X, ±X plus ±0 #X
2283 Subtraction +0 minus -0+0
2285 +0 minus +0, -0 minus -0 ±0
2286 Sign determined by rounding mode: + for nearest, up, or chop, - for down.
2289 +X minus +X, -X minus -X ±0
2290 Sign determined by rounding mode: + for nearest, up, or chop, - for down.
2295 Multiplication +0 * +0, -0 * -0 +0
2300 Multiplication -0 * -X, -X * -0 +0
2302 When extreme underflow denormalizes the result to zero.
2306 When extreme underflow denormalizes the result to zero.
2309 Division ±0 ÷ ±0 Invalid Operation
2310 ±X ÷ ±0 Þý (Zero Divide)
2314 When extreme underflow denormalizes the result to zero.
2318 When extreme underflow denormalizes the result to zero.
2321 FPREM, FPREM1 ±0 rem ±0 Invalid Operation
2322 ±X rem ±0 Invalid Operation
2325 FPREM +X rem ±Y +0 Y exactly divides X
2326 -X rem ±Y -0 Y exactly divides X
2327 FPREM1 +X rem ±Y +0 Y exactly divides X
2328 -X rem ±Y -0 Y exactly divides X
2331 Compare ±0 : +X ±0 < +X
2335 +0 C{3}=1; C{2}=C{1}=C{0}=0
2336 -0 C{3}=C{1}=1; C{2}=C{0}=0
2344 FSCALE ±0 scaled by -∞ *0
2345 ±0 scaled by +∞ Invalid Operation
2347 FXTRACT +0 ST=+0,ST(1)=-∞, Zero divide
2348 -0 ST=-0,ST(1)=-∞, Zero divide
2365 FYL2X ±Y * log(±0) Zero Divide
2366 ±0 * log(±0) Invalid Operation
2367 FYL2XP1 +Y * log(±0+1) *0
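The sign rules in Table 3-3 follow IEEE 754, so several rows can be observed with ordinary IEEE doubles. The following is a minimal sketch in Python, which always runs in round-to-nearest mode (so the rows that yield -0 only in the down mode cannot be shown); `sign_of` and `zero_rows` are hypothetical names introduced for illustration.

```python
import math

def sign_of(x):
    """Return '+' or '-' for any float, including signed zeros."""
    return '-' if math.copysign(1.0, x) < 0 else '+'

# A few zero-producing operations, evaluated under round-to-nearest.
zero_rows = {
    "+0 plus -0":  0.0 + (-0.0),     # +0 in nearest, up, or chop mode
    "-0 plus -0":  (-0.0) + (-0.0),  # like-signed zeros keep their sign
    "+X minus +X": 2.5 - 2.5,        # exact cancellation gives +0
    "-0 * -X":     (-0.0) * (-3.0),  # sign is the XOR of the operand signs
    "+0 * -X":     0.0 * (-3.0),
}

for name, value in zero_rows.items():
    print(f"{name:12s} -> {sign_of(value)}0")
```

Comparing with `==` would not distinguish the two zeros, which is why the sketch inspects the sign through `math.copysign`.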
2373 The real formats support signed representations of infinities. These values
2374 are encoded with a biased exponent of all ones and a significand of
2375 1.00..00; if the infinity is in a register, it is tagged special.
2377 A programmer may code an infinity, or it may be created by the NPX as its
2378 masked response to an overflow or a zero divide exception. Note that
2379 depending on rounding mode, the masked response may create the largest valid
2380 value representable in the destination rather than infinity.
2382 The signs of the infinities are observed, and comparisons are possible.
2383 Infinities are always interpreted in the affine sense; that is, -∞ < (any
2384 finite number) < +∞. Arithmetic on infinities is always exact and,
2385 therefore, signals no exceptions, except for the invalid operations
2386 specified in Table 3-4.
2389 Table 3-4. Infinity Operands and Results
2392 Key to symbols used in this table
2393 X Zero or nonzero positive operand.
2394 Y Nonzero positive operand.
2395 * Sign of original infinity operand.
2396 -* Complement of sign of original infinity operand.
2397 $ Sign of original operand.
2398 # Sign of the original Y operand.
2399 ⊕ Exclusive OR of signs of operands.
2402 Operation Operands Result
2404 Addition +∞ plus +∞ +∞
2406 +∞ plus -∞ Invalid Operation
2407 -∞ plus +∞ Invalid Operation
2410 Subtraction +∞ minus -∞ +∞
2412 +∞ minus +∞ Invalid Operation
2413 -∞ minus -∞ Invalid Operation
2416 Multiplication ±∞ * ±∞ ⊕∞
2418 ±0 * ±∞, ±∞ * ±0 Invalid Operation
2419 Division ±∞ ÷ ±∞ Invalid Operation
2423 FSQRT -∞ Invalid Operation
2425 FPREM, FPREM1 ±∞ rem ±∞ Invalid Operation
2426 ±∞ rem ±X Invalid Operation
2429 FSCALE ±∞ scaled by -∞ Invalid Operation
2433 Sign of original zero operand.
2436 ±0 scaled by +∞ Invalid Operation
2439 FXTRACT ±∞ ST = *∞, ST(1) = +∞
2440 Compare +∞ : +∞ +∞ = +∞
2462 FYL2X, FYL2XP1 ±∞ * log(1) Invalid Operation
2466 ±0 * log(+∞) Invalid Operation
2467 ±Y * log(-∞) Invalid Operation
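Some rows of Table 3-4 can be checked with IEEE doubles. The sketch below is Python, not 80387 code: Python has no exception flags, so a NaN result is used here as a stand-in for the masked invalid-operation response. The names `valid`, `invalid`, and `is_invalid` are hypothetical.

```python
import math

INF = math.inf

def is_invalid(result):
    """A NaN result stands in for the masked invalid-operation response."""
    return math.isnan(result)

# Affine ordering: -infinity < any finite number < +infinity.
assert -INF < -1e308 < 1e308 < INF

# Exact results: no exception is signaled.
valid = {
    "+inf plus +inf":  INF + INF,
    "+inf minus -inf": INF - (-INF),
    "-inf * +inf":     (-INF) * INF,   # sign is the XOR of operand signs
}

# Rows marked Invalid Operation in the table.
invalid = {
    "+inf plus -inf":  INF + (-INF),
    "+inf minus +inf": INF - INF,
    "0 * inf":         0.0 * INF,
    "inf / inf":       INF / INF,
}

for name, value in {**valid, **invalid}.items():
    print(f"{name:16s} -> {'invalid' if is_invalid(value) else value}")
```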
2470 3.1.4 NaN (Not-a-Number)
2472 A NaN (Not a Number) is a member of a class of special values that exists
2473 in the real formats only. A NaN has an exponent of 11..11B, may have either
2474 sign, and may have any significand except 1.00..00B, which is assigned to
2475 the infinities. A NaN in a register is tagged special.
2477 There are two classes of NaNs: signaling (SNaN) and quiet (QNaN). Among the
2478 QNaNs, the value real indefinite is of special interest.
2481 3.1.4.1 Signaling NaNs
2483 A signaling NaN is a NaN that has a zero as the most significant bit of its
2484 significand. The rest of the significand may be set to any value. The 80387
2485 never generates a signaling NaN as a result; however, it recognizes
2486 signaling NaNs when they appear as operands. Arithmetic operations (as
2487 defined at the beginning of this chapter) on a signaling NaN cause an
2488 invalid-operation exception (except for load operations, FXCH, FCHS, and
2491 By unmasking the invalid operation exception, the programmer can use
2492 signaling NaNs to trap to the exception handler. The generality of this
2493 approach and the large number of NaN values that are available provide the
2494 sophisticated programmer with a tool that can be applied to a variety of
2497 For example, a compiler could use signaling NaNs as references to
2498 uninitialized (real) array elements. The compiler could preinitialize each
2499 array element with a signaling NaN whose significand contained the index
2500 (relative position) of the element. If an application program attempted to
2501 access an element that it had not initialized, it would use the NaN placed
2502 there by the compiler. If the invalid operation exception were unmasked, an
2503 interrupt would occur, and the exception handler would be invoked. The
2504 exception handler could determine which element had been accessed, since the
2505 operand address field of the exception pointers would point to the NaN, and
2506 the NaN would contain the index number of the array element.
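The array-initialization idea can be sketched on raw bit patterns. This is an illustration only, assuming the double-real format and an arbitrary convention of storing index + 1 in the fraction (a zero fraction would encode infinity); the function names are hypothetical. Integers are used rather than floats so that no hardware or library can quiet the NaN in transit.

```python
FRACTION_BITS = 52                      # double-real fraction width
EXP_ALL_ONES = 0x7FF << FRACTION_BITS   # exponent field 11..11
QUIET_BIT = 1 << (FRACTION_BITS - 1)    # most significant fraction bit

def snan_for_index(index):
    """Bit pattern of a signaling NaN whose fraction holds index + 1."""
    payload = index + 1                 # fraction 0 would encode infinity
    if not 0 < payload < QUIET_BIT:
        raise ValueError("index does not fit below the quiet bit")
    return EXP_ALL_ONES | payload

def index_from_snan(bits):
    """Recover the array index stored by snan_for_index."""
    if (bits & EXP_ALL_ONES) != EXP_ALL_ONES or (bits & QUIET_BIT):
        raise ValueError("not a signaling NaN pattern")
    payload = bits & (QUIET_BIT - 1)
    if payload == 0:
        raise ValueError("fraction is zero: this is an infinity")
    return payload - 1

# Preinitialize a four-element "array" the way a compiler might.
uninitialized = [snan_for_index(i) for i in range(4)]
print([hex(bits) for bits in uninitialized])
```

An exception handler that found one of these patterns at the operand address could call the equivalent of `index_from_snan` to report which element was read before being written.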
2511 A quiet NaN is a NaN that has a one as the most significant bit of its
2512 significand. The 80387 creates the quiet NaN real indefinite (defined below)
2513 as its default response to certain exceptional conditions. The 80387 may
2514 derive other QNaNs by converting an SNaN. The 80387 converts an SNaN by
2515 setting the most significant bit of its significand to one, thereby
2516 generating a QNaN. The remaining bits of the significand are not changed;
2517 therefore, diagnostic information that may be stored in these bits of the
2518 SNaN is propagated into the QNaN.
2520 The 80387 will generate the special QNaN, real indefinite, as its masked
2521 response to an invalid operation exception. This NaN is signed negative; its
2522 significand is encoded 1.100..00. All other NaNs represent values created
2523 by programmers or derived from values created by programmers.
2525 Both quiet and signaling NaNs are supported in all operations. A QNaN is
2526 generated as the masked response for invalid-operation exceptions and as the
2527 result of an operation in which at least one of the operands is a QNaN. The
2528 80387 applies the rules shown in Table 3-5 when generating a QNaN:
2530 Note that handling of a QNaN operand has greater priority than all
2531 exceptions except certain invalid-operation exceptions (refer to the section
2532 "Exception Priority" in this chapter).
2534 Quiet NaNs could be used, for example, to speed up debugging. In its early
2535 testing phase, a program often contains multiple errors. An exception
2536 handler could be written to save diagnostic information in memory whenever
2537 it was invoked. After storing the diagnostic data, it could supply a quiet
2538 NaN as the result of the erroneous instruction, and that NaN could point to
2539 its associated diagnostic area in memory. The program would then continue,
2540 creating a different NaN for each error. When the program ended, the NaN
2541 results could be used to access the diagnostic data saved at the time the
2542 errors occurred. Many errors could thus be diagnosed and corrected in one
2546 Table 3-5. Rules for Generating QNaNs
2550 Real operation on an SNaN and Deliver the QNaN operand.
2553 Real operation on two SNaNs Deliver the QNaN that results from
2554 converting the SNaN that has the larger
2557 Real operation on two QNaNs Deliver the QNaN that has the larger
2560 Real operation on an SNaN and Deliver the QNaN that results from
2561 another number converting the SNaN.
2563 Real operation on a QNaN and Deliver the QNaN.
2566 Invalid operation that does not Deliver the default QNaN real indefinite.
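The rules of Table 3-5 can be modeled on double-real bit patterns. This is a sketch under stated assumptions: the extracted table cuts off after "the larger", and the code assumes the comparison is on the significand; the tie-break on equal significands (first operand wins) is also an assumption. All function names are hypothetical.

```python
FRACTION_BITS = 52
FRACTION_MASK = (1 << FRACTION_BITS) - 1
EXP_ALL_ONES = 0x7FF << FRACTION_BITS
QUIET_BIT = 1 << (FRACTION_BITS - 1)
SIGN_BIT = 1 << 63
# Real indefinite: negative sign, significand 1.100..00.
REAL_INDEFINITE = SIGN_BIT | EXP_ALL_ONES | QUIET_BIT

def is_nan(bits):
    return (bits & EXP_ALL_ONES) == EXP_ALL_ONES and (bits & FRACTION_MASK) != 0

def is_snan(bits):
    return is_nan(bits) and not (bits & QUIET_BIT)

def quiet(bits):
    """Convert an SNaN to a QNaN; the remaining payload bits are kept."""
    return bits | QUIET_BIT

def qnan_result(a, b):
    """QNaN delivered for an invalid or NaN-carrying two-operand operation."""
    nans = [x for x in (a, b) if is_nan(x)]
    if not nans:
        return REAL_INDEFINITE              # invalid operation without NaNs
    if len(nans) == 1:
        return quiet(nans[0])               # NaN with another number
    if is_snan(a) != is_snan(b):
        return b if is_snan(a) else a       # SNaN with QNaN: the QNaN operand
    # Two SNaNs or two QNaNs: the larger significand wins (assumed rule).
    return quiet(max(a, b, key=lambda x: x & FRACTION_MASK))
```

For example, `qnan_result(0x7FF0000000000005, 0x3FF0000000000000)` quiets the SNaN and keeps its payload of 5.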
2572 For every 80387 numeric data type, one unique encoding is reserved for
2573 representing the special value indefinite. The 80387 produces this encoding
2574 as its response to a masked invalid-operation exception.
2576 In the case of reals, the indefinite value is a QNaN as discussed in the
2579 Packed decimal indefinite may be stored by the NPX in an FBSTP instruction;
2580 attempting to use this encoding in an FBLD instruction, however, will have an
2581 undefined result; thus indefinite cannot be loaded from a packed decimal
2584 In the binary integers, the same encoding may represent either indefinite
2585 or the largest negative number supported by the format (-2^(15), -2^(31), or
2586 -2^(63)). The 80387 will store this encoding as its masked response to
2587 an invalid operation, or when the value in a source register represents or
2588 rounds to the largest negative integer representable by the destination. In
2589 situations where its origin may be ambiguous, the invalid-operation
2590 exception flag can be examined to see if the value was produced by an
2591 exception response. When this encoding is loaded or used by an integer
2592 arithmetic or compare operation, it is always interpreted as a negative
2593 number; thus indefinite cannot be loaded from a binary integer.
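The ambiguity between integer indefinite and the largest negative integer can be illustrated with a small model of a masked 16-bit store. This is a sketch, not the 80387 algorithm: it assumes round-to-nearest mode, ignores the empty-register case, and `fist16` is a hypothetical name.

```python
import math

INT16_INDEFINITE = -2**15   # encoding 1 00...00 (8000H)

def fist16(value):
    """Model a masked 16-bit integer store in round-to-nearest mode.

    Returns (stored_integer, invalid_flag)."""
    if math.isnan(value) or math.isinf(value):
        return INT16_INDEFINITE, True
    rounded = round(value)              # Python rounds halves to even
    if not -2**15 <= rounded <= 2**15 - 1:
        return INT16_INDEFINITE, True   # exceeds the representable range
    return rounded, False

for x in (2.5, -32768.0, 40000.0, math.nan):
    print(x, fist16(x))
```

Storing -32768.0 and storing 40000.0 leave the same bit pattern in memory; only the invalid-operation flag tells the two cases apart, which is the check the text recommends.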
2596 3.1.6 Encoding of Data Types
2598 Tables 3-6 through 3-9 show how each of the special values just
2599 described is encoded for each of the numeric data types. In these tables,
2600 the least-significant bits are shown to the right and are stored in the
2601 lowest memory addresses. The sign bit is always the left-most bit of the
2602 highest-addressed byte.
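The storage convention can be checked by dumping the memory image of a double real. A minimal sketch using Python's `struct` module with an explicit little-endian format; `memory_image` and `sign_bit` are hypothetical helper names.

```python
import struct

def memory_image(value):
    """Bytes of a double real, lowest memory address first."""
    return struct.pack('<d', value)

def sign_bit(image):
    """Left-most bit of the highest-addressed byte."""
    return image[-1] >> 7

image = memory_image(-1.0)   # sign 1, exponent 011...11, fraction 0
print(image.hex(), sign_bit(image))
```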
2605 3.1.7 Unsupported Formats
2607 The extended format permits many bit patterns that do not fall into any of
2608 the previously mentioned categories. Some of these encodings were supported
2609 by the 80287 NPX; however, most of them are not supported by the 80387 NPX.
2610 These changes are required due to changes made in the final version of the
2611 IEEE 754 standard that eliminated these data types.
2613 The categories of encodings formerly known as pseudozeros, pseudo-NaNs,
2614 pseudoinfinities, and unnormal numbers are not supported by the 80387. The
2615 80387 raises the invalid-operation exception when they are encountered as
2618 The encodings formerly known as pseudodenormal numbers are not generated by
2619 the 80387; however, they are correctly utilized when encountered in operands
2620 to 80387 instructions. The exponent is treated as if it were 00..01 and the
2621 mantissa is unchanged. The denormal exception is raised.
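The categories above can be summarized as a classifier over the two fields of an extended-real value. This is an illustrative sketch, not Intel's decoding logic; the sign bit is ignored because it does not affect the class, and `classify_extended` is a hypothetical name.

```python
INTEGER_BIT = 1 << 63            # explicit integer bit of the significand
QUIET_BIT = 1 << 62              # most significant fraction bit
FRACTION_MASK = INTEGER_BIT - 1  # the 63 fraction bits

def classify_extended(exponent, significand):
    """Classify a 15-bit biased exponent and a 64-bit significand."""
    integer = bool(significand & INTEGER_BIT)
    fraction = significand & FRACTION_MASK
    if exponent == 0:
        if significand == 0:
            return "zero"
        # Integer bit set with a zero exponent: treated as exponent 00..01.
        return "pseudodenormal" if integer else "denormal"
    if exponent == 0x7FFF:
        if not integer:
            return "unsupported"     # pseudo-NaN or pseudoinfinity
        if fraction == 0:
            return "infinity"
        return "QNaN" if fraction & QUIET_BIT else "SNaN"
    # Integer bit clear with a nonzero exponent: unnormal or pseudozero.
    return "normal" if integer else "unsupported"

print(classify_extended(0x3FFF, 1 << 63))   # the value 1.0
```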
2624 Table 3-6. Binary Integer Encodings
Class                              Sign    Magnitude
------------------------------------------------------
Positives  (Largest)                0      11...11
           (Smallest)               0      00...01
------------------------------------------------------
Negatives  (Smallest)               1      11...11
           (Largest/Indefinite*)    1      00...00
------------------------------------------------------
Short: 31 bits
* If this encoding is used as a source operand (as in an integer load or
integer arithmetic instruction), the 80387 interprets it as the largest
negative number representable in the format: -2^(15), -2^(31), or -2^(63).
The 80387 will deliver this encoding to an integer destination in two
1. If the result is the largest negative number
2. As the response to a masked invalid operation exception, in which
case it represents the special value integer indefinite.
2655 Table 3-7. Packed Decimal Encodings
                                    |<------------ Magnitude ------------>|
Class                  Sign         digit digit digit digit . . . digit
--------------------------------------------------------------------------
Positives  (Largest)    0  0000000  1001  1001  1001  1001  . . . 1001
           (Smallest)   0  0000000  0000  0000  0000  0000  . . . 0001
           Zero         0  0000000  0000  0000  0000  0000  . . . 0000
--------------------------------------------------------------------------
Negatives  Zero         1  0000000  0000  0000  0000  0000  . . . 0000
           (Smallest)   1  0000000  0000  0000  0000  0000  . . . 0001
           (Largest)    1  0000000  1001  1001  1001  1001  . . . 1001
--------------------------------------------------------------------------
Indefinite*             1  1111111  1111  1111  UUUU  UUUU  . . . UUUU
                       |<- 1 byte ->|<------------- 9 bytes ------------>|
* The packed decimal indefinite encoding is stored by FBSTP in response to a
masked invalid operation exception. Attempting to load this value via FBLD
produces an undefined result.
UUUU means bit values are undefined and may contain any value.
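The layout in Table 3-7 (a sign byte followed by 18 decimal digits, two per byte) can be sketched as an encoder. This is an illustration under the storage convention stated in section 3.1.6 (least-significant bits at the lowest address); `to_packed_decimal` is a hypothetical name, and because Python integers have no negative zero, the negative-zero row cannot be produced by this sketch.

```python
def to_packed_decimal(n):
    """Encode an integer as 10 bytes, lowest memory address first."""
    if abs(n) >= 10**18:
        raise ValueError("magnitude exceeds 18 decimal digits")
    digits = f"{abs(n):018d}"                 # most significant digit first
    # Two digits per byte; the least significant pair goes to address 0.
    pairs = [int(digits[i:i + 2], 16) for i in range(16, -1, -2)]
    sign_byte = 0x80 if n < 0 else 0x00       # sign bit plus seven zero bits
    return bytes(pairs + [sign_byte])

print(to_packed_decimal(-123).hex())
```

Parsing each digit pair as hexadecimal is a shortcut for placing one decimal digit in each 4-bit nibble.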
2687 Table 3-8. Single and Double Real Encodings
Class                            Sign   Biased Exponent   Significand ff--ff*
-------------------------------------------------------------------------------
Positives  NaNs   Quiet           0       11...11           11...11
                                  0       11...11           10...00
                  Signaling       0       11...11           01...11
                                  0       11...11           00...01
           +∞                     0       11...11           00...00
           Reals  Normals         0       11...10           11...11
                                  0       00...01           00...00
                  Denormals       0       00...00           11...11
                                  0       00...00           00...01
                  Zero            0       00...00           00...00
-------------------------------------------------------------------------------
Negatives  Reals  Zero            1       00...00           00...00
                  Denormals       1       00...00           00...01
                                  1       00...00           11...11
                  Normals         1       00...01           00...00
                                  1       11...10           11...11
           -∞                     1       11...11           00...00
           NaNs   Signaling       1       11...11           00...01
                                  1       11...11           01...11
                  Quiet           1       11...11           10...00  (Indefinite)
                                  1       11...11           11...11
-------------------------------------------------------------------------------
Single:                                   8 bits            23 bits
Double:                                  11 bits            52 bits
* Integer bit is implied and not stored.
2760 Table 3-9. Extended Real Encodings
Class                                  Sign   Biased Exponent   Significand 1.ff--ff
--------------------------------------------------------------------------------------
Positives  NaNs   Quiet                 0       11...11           1 11..11
                                        0       11...11           1 10..00
                  Signaling             0       11...11           1 01..11
                                        0       11...11           1 00..01
           +∞                           0       11...11           1 00..00
           Reals  Normals               0       11...10           1 11..11
                                        0       00...01           1 00..00
                  Unsupported           0       11...10           0 11..11
                  8087 Unnormals        0       00...01           0 00..00
                  Pseudodenormals       0       00...00           1 11..11
                                        0       00...00           1 00..00
                  Denormals             0       00...00           0 11..11
                                        0       00...00           0 00..01
                  Zero                  0       00...00           0 00..00
--------------------------------------------------------------------------------------
Negatives  Reals  Zero                  1       00...00           0 00..00
                  Denormals             1       00...00           0 00..01
                                        1       00...00           0 11..11
                  Pseudodenormals       1       00...00           1 00..00
                                        1       00...00           1 11..11
                  Unsupported           1       00...01           0 00..00
                  8087 Unnormals        1       11...10           0 11..11
                  Normals               1       00...01           1 00..00
                                        1       11...10           1 11..11
           -∞                           1       11...11           1 00..00
           NaNs   Signaling             1       11...11           1 00..01
                                        1       11...11           1 01..11
                  Quiet (Indefinite)    1       11...11           1 10..00
                  Quiet                 1       11...11           1 10..00
                                        1       11...11           1 11..11
--------------------------------------------------------------------------------------
                                                15 bits           64 bits
2840 3.2 Numeric Exceptions
2842 The 80387 can recognize six classes of numeric exception conditions while
2843 executing numeric instructions:
2845 1. I -- Invalid operation
2847    - IEEE standard invalid operation
2848 2. Z -- Divide-by-zero
2849 3. D -- Denormalized operand
2850 4. O -- Numeric overflow
2851 5. U -- Numeric underflow
2852 6. P -- Inexact result (precision)
2855 3.2.1 Handling Numeric Exceptions
2857 When numeric exceptions occur, the NPX takes one of two possible courses of
2860 - The NPX can itself handle the exception, producing the most reasonable
2861 result and allowing numeric program execution to continue undisturbed.
2863 - A software exception handler can be invoked by the CPU to handle the
2866 Each of the six exception conditions described above has a corresponding
2867 flag bit in the 80387 status word and a mask bit in the 80387 control word.
2868 If an exception is masked (the corresponding mask bit in the control
2869 word = 1), the 80387 takes an appropriate default action and continues with
2870 the computation. If the exception is unmasked (mask = 0), the 80387 asserts
2871 the ERROR# output to the 80386 to signal the exception and invoke a software
2874 Note that when exceptions are masked, the NPX may detect multiple
2875 exceptions in a single instruction, because it continues executing the
2876 instruction after performing its masked response. For example, the 80387
2877 could detect a denormalized operand, perform its masked response to this
2878 exception, and then detect an underflow.
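The interplay of mask bits and flag bits can be modeled in a few lines. The sketch below assumes the standard layout in which the six exceptions occupy bits 0 through 5 of both words in the order I, D, Z, O, U, P (this bit order comes from the control and status word definitions, not from the list above) and that FINIT leaves the control word at 037FH with all exceptions masked. `respond` and `flags_set` are hypothetical names.

```python
EXCEPTIONS = ("I", "D", "Z", "O", "U", "P")   # bits 0..5 of both words

def respond(control_word, status_word, exception):
    """Model the NPX reaction to one detected exception.

    Returns (new_status_word, error_signaled)."""
    bit = 1 << EXCEPTIONS.index(exception)
    status_word |= bit                     # the flag is set in either case
    masked = bool(control_word & bit)      # mask bit = 1 means masked
    return status_word, not masked

def flags_set(status_word):
    """Names of the exception flags recorded in a status word."""
    return [name for i, name in enumerate(EXCEPTIONS) if (status_word >> i) & 1]

# One instruction detects a denormal operand and then an underflow,
# both masked, so both flags accumulate and ERROR# is never asserted.
cw, sw = 0x037F, 0
sw, err_d = respond(cw, sw, "D")
sw, err_u = respond(cw, sw, "U")
print(flags_set(sw), err_d or err_u)
```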
2881 3.2.1.1 Automatic Exception Handling
2883 The 80387 NPX has a default fix-up activity for every possible exception
2884 condition it may encounter. These masked-exception responses are designed to
2885 be safe and are generally acceptable for most numeric applications.
2887 As an example of how even severe exceptions can be handled safely and
2888 automatically using the NPX's default exception responses, consider a
2889 calculation of the parallel resistance of several values using only the
2890 standard formula (Figure 3-3). If R{1} becomes zero, the circuit resistance
2891 becomes zero. With the divide-by-zero and precision exceptions masked, the
2892 80387 NPX will produce the correct result.
2894 By masking or unmasking specific numeric exceptions in the NPX control
2895 word, NPX programmers can delegate responsibility for most exceptions to the
2896 NPX, reserving the most severe exceptions for programmed exception handlers.
2897 Exception-handling software is often difficult to write, and the NPX's
2898 masked responses have been tailored to deliver the most reasonable result
2899 for each condition. For the majority of applications, masking all
2900 exceptions other than invalid-operation yields satisfactory results with the
2901 least programming effort. An invalid-operation exception normally indicates
2902 a program error that must be corrected; this exception should not normally
2905 The exception flags in the NPX status word provide a cumulative record of
2906 exceptions that have occurred since these flags were last cleared. Once set,
2907 these flags can be cleared only by executing the FCLEX (clear exceptions)
2908 instruction, by reinitializing the NPX, or by overwriting the flags with an
2909 FRSTOR or FLDENV instruction. This allows a programmer to mask all
2910 exceptions (except invalid operation), run a calculation, and then inspect
2911 the status word to see if any exceptions were detected at any point in the
2915 Figure 3-3. Arithmetic Example Using Infinity
2928 EQUIVALENT RESISTANCE = 1 / (1/R{1} + 1/R{2} + 1/R{3})
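The parallel-resistance example can be traced numerically. Python raises an error on float division by zero, so the sketch below supplies a hypothetical `masked_divide` helper that imitates the masked zero-divide response described in this chapter (an infinity signed with the exclusive OR of the operand signs); it is a model, not 80387 code.

```python
import math

def masked_divide(a, b):
    """Divide the way the NPX does with the zero-divide exception masked."""
    if b != 0.0:
        return a / b
    if a == 0.0 or math.isnan(a):
        return math.nan                  # 0 / 0 is an invalid operation
    negative = math.copysign(1.0, a) != math.copysign(1.0, b)
    return -math.inf if negative else math.inf

def parallel_resistance(*resistors):
    """1 / (1/R1 + 1/R2 + ...), using the masked responses throughout."""
    total = sum(masked_divide(1.0, r) for r in resistors)
    return masked_divide(1.0, total)

# With R1 = 0 the term 1/R1 becomes +infinity, the sum stays +infinity,
# and the final reciprocal is zero, the physically correct answer.
print(parallel_resistance(0.0, 100.0, 200.0))
```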
2932 3.2.1.2 Software Exception Handling
2934 If the NPX encounters an unmasked exception condition, it signals the
2935 exception to the 80386 CPU using the ERROR# status line between the two
2938 The next time the 80386 CPU encounters a WAIT or ESC instruction in its
2939 instruction stream, the 80386 will detect the active condition of the ERROR#
2940 status line and automatically trap to an exception response routine using
2941 interrupt #16, the "processor extension error" exception.
2943 This exception response routine is normally a part of the systems software.
2944 Typical exception responses may include:
2946 - Incrementing an exception counter for later display or printing
2948 - Printing or displaying diagnostic information (e.g., the 80387
2949 environment and registers)
2951 - Aborting further execution
2953 - Using the exception pointers to build an instruction that will run
2954 without exception and executing it
2956 For 80386 systems having systems software support for the 80387 NPX,
2957 applications programmers should consult the operating system's reference
2958 manuals for the appropriate system response to NPX exceptions. For systems
2959 programmers, specific details on writing software exception handlers are
2960 included in Chapter 6.
2963 3.2.2 Invalid Operation
2965 This exception may occur in response to two general classes of operations:
2968 2. Arithmetic operations
2970 The stack flag (SF) of the status word indicates which class of operation
2971 caused the exception. When SF is 1 a stack operation has resulted in stack
2972 overflow or underflow; when SF is 0, an arithmetic instruction has
2973 encountered an invalid operand.
2976 3.2.2.1 Stack Exception
2978 When SF is 1, indicating a stack operation, the O/U# bit of the condition
2979 code (bit C{1}) distinguishes between stack overflow and underflow as
2982 O/U# = 1 Stack overflow -- an instruction attempted to push down a
2983 nonempty stack location.
2985 O/U# = 0 Stack underflow -- an instruction attempted to read an
2986 operand from an empty stack location.
2988 When the invalid-operation exception is masked, the 80387 returns the QNaN
2989 indefinite. This value overwrites the destination register, destroying
2990 its original contents.
2992 When the invalid-operation exception is not masked, the 80386 exception
2993 "processor extension error" is triggered. TOP is not changed, and the source
2994 operands remain unaffected.
2997 3.2.2.2 Invalid Arithmetic Operation
2999 This class includes the invalid operations defined in IEEE Std 754. The
3000 80387 reports an invalid operation in any of the cases shown in Table 3-10.
3001 Also shown in this table are the 80387's responses when the invalid
3002 exception is masked. When unmasked, the 80386 exception "processor extension
3003 error" is triggered, and the operands remain unaltered. An invalid operation
3004 generally indicates a program error.
3007 Table 3-10. Masked Responses to Invalid Operations
3010 Condition Masked Response
3012 Any arithmetic operation Return the QNaN indefinite.
3013 on an unsupported format.
3015 Any arithmetic operation Return a QNaN (refer to the section
3016 on a signaling NaN. "Rules for Generating QNaNs").
3018 Compare and test operations: Set condition codes "not comparable."
3019 one or both operands is a NaN.
3021 Addition of opposite-signed Return the QNaN indefinite.
3022 infinities or subtraction of
3023 like-signed infinities.
3025 Multiplication: ∞ * 0; or 0 * ∞. Return the QNaN indefinite.
3027 Division: ∞ ÷ ∞; or 0 ÷ 0. Return the QNaN indefinite.
3029 Remainder instructions FPREM, Return the QNaN indefinite; set C{2}.
3030 FPREM1 when modulus (divisor)
3031 is zero or dividend is ∞.
3033 Trigonometric instructions FCOS, Return the QNaN indefinite; set C{2}.
3034 FPTAN, FSIN, FSINCOS when
3037 FSQRT of negative operand (except Return the QNaN indefinite.
3038 FSQRT (-0) = -0), FYL2X of
3039 negative operand (except FYL2X
3040 (-0) = -∞), FYL2XP1 of operand
3041 more negative than -1.
3043 FIST(P) instructions when source Store integer indefinite.
3044 register is empty, a NaN, ∞,
3045 or exceeds representable range
3048 FBSTP instruction when source Store packed decimal indefinite.
3049 register is empty, a NaN, ∞, or
3050 exceeds 18 decimal digits.
3052 FXCH instruction when one or Change empty registers to the QNaN
3053 both registers are tagged empty. indefinite and then perform exchange.
3056 3.2.3 Division by Zero
3058 If an instruction attempts to divide a finite nonzero operand by zero, the
3059 80387 will report a zero-divide exception. This is possible for
3060 F(I)DIV(R)(P) as well as the other instructions that perform division
3061 internally: FYL2X and FXTRACT. The masked response for FDIV and FYL2X is to
3062 return an infinity signed with the exclusive OR of the signs of the
3063 operands. For FXTRACT, ST(1) is set to -∞; ST is set to zero with the same
3064 sign as the original operand. If the divide-by-zero exception is unmasked,
3065 the 80386 exception "processor extension error" is triggered; the operands
3069 3.2.4 Denormal Operand
3071 If an arithmetic instruction attempts to operate on a denormal operand, the
3072 NPX reports the denormal-operand exception. Denormal operands may have
3073 reduced significance due to lost low-order bits; therefore, it may be
3074 advisable in certain applications to preclude operations on these operands.
3075 This can be accomplished by an exception handler that responds to unmasked
3076 denormal exceptions. Most users will mask this exception so that
3077 computation may proceed; any loss of accuracy will be analyzed by the user
3078 when the final result is delivered.
3080 When this exception is masked, the 80387 sets the D-bit in the status word,
3081 then proceeds with the instruction. Gradual underflow and denormal numbers
3082 as handled on the 80387 will produce results at least as good as, and often
3083 better than, what could be obtained from a machine that flushes underflows to
3084 zero. In fact, a denormal operand in single- or double-precision format will
3085 be normalized to the extended-real format when loaded into the 80387.
3086 Subsequent operations will benefit from the additional precision of the
3087 extended-real format used internally.
3089 When this exception is not masked, the D-bit is set and the exception
3090 handler is invoked. The operands are not changed by the instruction and are
3091 available for inspection by the exception handler.
3093 If an 8087/80287 program uses the denormal exception to automatically
3094 normalize denormal operands, then that program can run on an 80387 by
3095 masking the denormal exception. The 8087/80287 denormal exception handler
3096 would not be used by the 80387 in this case. A numerics program runs faster
3097 when the 80387 performs normalization of denormal operands. A program can
3098 detect at run-time whether it is running on an 80387 or 8087/80287 and
3099 disable the denormal exception when an 80387 is used. The following code
3100 sequence is recommended to distinguish between an 80387 and an 8087/80287.
FINIT              ; Use default infinity mode:
                   ;   projective for 8087/80287, affine for 80387
FLD1               ; Generate infinity
FLDZ               ;   by dividing 1
FDIV               ;   by zero
FLD     ST         ; Form negative infinity
FCHS
FCOMPP             ; Compare +infinity with -infinity
FSTSW   temp       ; 8087/80287 will say they are equal
MOV     AX, temp
SAHF
JNZ     Using_80387
3117 The denormal-operand exception of the 80387 permits emulation of arithmetic
3118 on unnormal operands as provided by the 8087/80287. The standard does not
3119 require the denormal exception nor does it recognize the unnormal data type.
3122 3.2.5 Numeric Overflow and Underflow
3124 If the exponent of a numeric result is too large for the destination real
3125 format, the 80387 signals a numeric overflow. Conversely, if the exponent of
3126 a result is too small to be represented in the destination format, a numeric
3127 underflow is signaled. If either of these exceptions occurs, the result of
3128 the operation is outside the range of the destination real format.
3130 Typical algorithms are most likely to produce extremely large and small
3131 numbers in the calculation of intermediate, rather than final, results.
3132 Because of the great range of the extended-precision format (recommended as
3133 the destination format for intermediates), overflow and underflow are
3134 relatively rare events in most 80387 applications.
3139 The overflow exception can occur whenever the rounded true result would
3140 exceed in magnitude the largest finite number in the destination format. The
3141 exception can occur in the execution of most of the arithmetic instructions
3142 and in some of the conversion instructions; namely, FST(P), F(I)ADD(P),
3143 F(I)SUB(R)(P), F(I)MUL(P), FDIV(R)(P), FSCALE, FYL2X, and FYL2XP1.
3145 The response to an overflow condition depends on whether the overflow
3146 exception is masked:
3148 - Overflow exception masked. The value returned depends on the rounding
3149 mode as Table 3-11 illustrates.
3151 - Overflow exception not masked. The unmasked response depends on
3152 whether the instruction is supposed to store the result on the stack
3155 -- Destination is the stack. The true result is divided by 2^(24,576)
3156 and rounded. (The bias 24,576 is equal to 3 * 2^(13).) The
3157 significand is rounded to the appropriate precision (according to
3158 the precision control (PC) bit of the control word, for those
3159 instructions controlled by PC, otherwise to extended precision).
3160 The roundup bit (C{1}) of the status word is set if the
3161 significand was rounded upward.
3163 The biasing of the exponent by 24,576 normally translates the
3164 number as nearly as possible to the middle of the exponent range
3165 so that, if desired, it can be used in subsequent scaled
3166 operations with less risk of causing further exceptions. With the
3167 instruction FSCALE, however, it can happen that the result is too
3168 large and overflows even after biasing. In this case, the unmasked
3169 response is exactly the same as the masked round-to-nearest
3170 response, namely ± infinity. The intention of this feature is to
3171 ensure the trap handler will discover that a translation of the
3172 exponent by -24576 would not work correctly without obliging the
3173 programmer of Decimal-to-Binary or Exponential functions to
3174 determine which trap handler, if any, should be invoked.
3176 -- Destination is memory (this can occur only with the store
3177 instructions). No result is stored in memory. Instead, the operand
3178 is left intact in the stack. Because the data in the stack is in
3179 extended-precision format, the exception handler has the option
3180 either of reexecuting the store instruction after proper
3181 adjustment of the operand or of rounding the significand on the
3182 stack to the destination's precision as the standard requires. The
3183 exception handler should ultimately store a value into the
3184 destination location in memory if the program is to continue.
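The effect of the 24,576 bias on the exponent can be checked with integer arithmetic. The sketch below assumes the extended-real format's largest unbiased exponent is 16383 (a property of the format, not stated in this section); `unmasked_overflow_exponent` is a hypothetical name.

```python
BIAS_ADJUST = 3 * 2**13   # 24,576
EXT_MAX_EXP = 16383       # largest unbiased exponent of the extended format

def unmasked_overflow_exponent(true_exponent):
    """Exponent delivered to the stack after an unmasked overflow.

    Returns None when the value still overflows after biasing, the case
    the text describes for FSCALE."""
    adjusted = true_exponent - BIAS_ADJUST
    return adjusted if adjusted <= EXT_MAX_EXP else None

# The product of two of the largest operands has an exponent near 32767;
# after biasing it lands near the middle of the positive exponent range.
print(unmasked_overflow_exponent(32767))
# FSCALE can produce exponents far beyond that, which biasing cannot rescue.
print(unmasked_overflow_exponent(50000))
```

A trap handler that wants the true exponent back adds 24,576 to the delivered exponent.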
3187 Table 3-11. Masked Overflow Results
Rounding Mode    Sign of True Result    Result
------------------------------------------------------------------
To nearest       +                      +∞
                 -                      -∞
Toward -∞        +                      Largest finite positive number
                 -                      -∞
Toward +∞        +                      +∞
                 -                      Largest finite negative number
Toward zero      +                      Largest finite positive number
                 -                      Largest finite negative number
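The masked overflow results follow one rule: infinity is delivered only when the rounding direction points away from zero for that sign. A minimal sketch, using the largest finite double as a stand-in for the destination format's largest finite number; the mode names and `masked_overflow_result` are hypothetical.

```python
import math
import sys

LARGEST = sys.float_info.max   # largest finite double, a stand-in format
MODES = ("nearest", "down", "up", "zero")

def masked_overflow_result(mode, negative):
    """Result delivered for a masked overflow, by rounding mode and sign."""
    if mode not in MODES:
        raise ValueError(f"unknown rounding mode: {mode}")
    toward_infinity = (
        mode == "nearest"
        or (mode == "up" and not negative)
        or (mode == "down" and negative)
    )
    magnitude = math.inf if toward_infinity else LARGEST
    return -magnitude if negative else magnitude

for mode in MODES:
    print(mode, masked_overflow_result(mode, False),
          masked_overflow_result(mode, True))
```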
3204 Underflow can occur in the execution of the instructions FST(P), FADD(P),
3205 FSUB(RP), FMUL(P), F(I)DIV(RP), FSCALE, FPREM(1), FPTAN, FSIN, FCOS,
3206 FSINCOS, FPATAN, F2XM1, FYL2X, and FYL2XP1.
3208 Two related events contribute to underflow:
3210 1. Creation of a tiny result which, because it is so small, may cause
3211 some other exception later (such as overflow upon division).
3213 2. Creation of an inexact result; i.e., the delivered result differs from
3214 what would have been computed were both the exponent range and
3215 precision unbounded.
3217 Which of these events triggers the underflow exception depends on whether
3218 the underflow exception is masked:
3220 1. Underflow exception masked. The underflow exception is signaled when
3221 the result is both tiny and inexact.
3223 2. Underflow exception not masked. The underflow exception is signaled
3224 when the result is tiny, regardless of inexactness.
3226 The response to an underflow exception also depends on whether the
3227 exception is masked:
3229 1. Masked response. The result is denormal or zero. The precision
3230 exception is also triggered.
3232 2. Unmasked response. The unmasked response depends on whether the
3233 instruction is supposed to store the result on the stack or in memory:
• Destination is the stack. The true result is multiplied by
3236 2^(24,576) and rounded. (The bias 24,576 is equal to 3 * 2^(13).)
3237 The significand is rounded to the appropriate precision (according
3238 to the precision control (PC) bit of the control word, for those
3239 instructions controlled by PC, otherwise to extended precision).
The roundup bit (C{1}) of the status word is set if the significand
was rounded upward.
3243 The biasing of the exponent by 24,576 normally translates the
3244 number as nearly as possible to the middle of the exponent range so
3245 that, if desired, it can be used in subsequent scaled operations
3246 with less risk of causing further exceptions. With the instruction
3247 FSCALE, however, it can happen that the result is too tiny and
3248 underflows even after biasing. In this case, the unmasked response
3249 is exactly the same as the masked round-to-nearest response, namely
3250 ±0. The intention of this feature is to ensure the trap handler
3251 will discover that a translation by +24576 would not work correctly
3252 without obliging the programmer of Decimal-to-Binary or Exponential
functions to determine which trap handler, if any, should be
invoked.

• Destination is memory (this can occur only with the store
3257 instructions). No result is stored in memory. Instead, the operand
3258 is left intact in the stack. Because the data in the stack is in
3259 extended-precision format, the exception handler has the option
3260 either of reexecuting the store instruction after proper adjustment
3261 of the operand or of rounding the significand on the stack to the
3262 destination's precision as the standard requires. The exception
3263 handler should ultimately store a value into the destination
3264 location in memory if the program is to continue.
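
The exponent adjustment described for the stack-destination case is plain integer arithmetic on the true exponent. The following is a minimal sketch, using Python only to model that arithmetic. The function names are hypothetical, and the exponent limits of the extended format (-16382 to +16383) are assumptions that this section does not state.

```python
# Model of the unmasked-underflow exponent adjustment (stack destination).
# Only the exponent arithmetic is modeled; significand rounding is ignored.

EXPONENT_ADJUST = 3 * 2**13      # 24,576, the bias quoted in the text
MIN_NORMAL_EXP = -16382          # assumed smallest normal true exponent (extended real)
MAX_NORMAL_EXP = 16383           # assumed largest true exponent (extended real)


def adjusted_exponent(true_exp):
    """Return the exponent after the true result is multiplied by 2^24576."""
    return true_exp + EXPONENT_ADJUST


def still_underflows(true_exp):
    """True if the result is too tiny even after the adjustment (the FSCALE case)."""
    return adjusted_exponent(true_exp) < MIN_NORMAL_EXP


print(adjusted_exponent(-16500))   # 8076, roughly mid-range
print(still_underflows(-16500))    # False
print(still_underflows(-50000))    # True
```

A handler that wants the true result back would subtract the same constant from the exponent of the delivered value.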
3267 3.2.6 Inexact (Precision)
3269 This exception condition occurs if the result of an operation is not
3270 exactly representable in the destination format. For example, the fraction
3271 1/3 cannot be precisely represented in binary form. This exception occurs
frequently and indicates that some (generally acceptable) accuracy has been
lost.
3275 All the transcendental instructions are inexact by definition; they always
3276 cause the inexact exception.
3278 The C{1} (roundup) bit of the status word indicates whether the inexact
3279 result was rounded up (C{1} = 1) or chopped (C{1} = 0).
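
The 1/3 example can be checked with exact rational arithmetic. This sketch is an illustration only: it uses Python's double-precision floats rather than the 80387 extended format, so the direction of rounding shown here need not match what the 80387 reports in C{1} for the same quotient. The variable names are hypothetical.

```python
from fractions import Fraction

exact = Fraction(1, 3)            # the true result of 1/3
stored = Fraction(1.0 / 3.0)      # exact value of the rounded double-precision result
error = stored - exact

is_inexact = error != 0           # the case in which the precision flag would be set
rounded_up = error > 0            # what a roundup indicator would report in this model

print(is_inexact, rounded_up)
```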
3281 The inexact exception accompanies the underflow exception when there is
3282 also a loss of accuracy. When underflow is masked, the underflow exception
3283 is signaled only when there is a loss of accuracy; therefore the precision
3284 flag is always set as well. When underflow is unmasked, there may or may not
3285 have been a loss of accuracy; the precision bit indicates which is the case.
3287 This exception is provided for applications that need to perform exact
3288 arithmetic only. Most applications will mask this exception. The 80387
3289 delivers the rounded or over/underflowed result to the destination,
3290 regardless of whether a trap occurs.
3293 3.2.7 Exception Priority
3295 The 80387 deals with exceptions according to a predetermined precedence.
3296 Precedence in exception handling means that higher-priority exceptions are
3297 flagged and results are delivered according to the requirements of that
3298 exception. Lower-priority exceptions may not be flagged even if they occur.
3299 For example, dividing an SNaN by zero causes an invalid-operand exception
3300 (due to the SNaN) and not a zero-divide exception; the masked result is the
QNaN real indefinite, not ∞. A denormal or inexact (precision) exception,
3302 however, can accompany a numeric underflow or overflow exception.
3304 The exception precedence is as follows:
3306 1. Invalid operation exception, subdivided as follows:
3310 c. Operand of unsupported format.
3313 2. QNaN operand. Though this is not an exception, if one operand is a
3314 QNaN, dealing with it has precedence over lower-priority exceptions.
3315 For example, a QNaN divided by zero results in a QNaN, not a
3316 zero-divide exception.
3. Any other invalid-operation exception not mentioned above or zero
divide.
3321 4. Denormal operand. If masked, then instruction execution continues,
3322 and a lower-priority exception can occur as well.
5. Numeric overflow and underflow. Inexact result (precision) can be
flagged as well.
3327 6. Inexact result (precision).
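
The precedence list can be read as an ordered lookup: of all the conditions an instruction meets, the one highest in the list governs the delivered result. The sketch below is a simplified model, not a description of the hardware. The level names are hypothetical labels, and placing an SNaN operand in level 1 is an assumption based on the SNaN example given earlier in this section.

```python
# Precedence levels in the order listed above (index 0 = highest priority).
PRECEDENCE = [
    "invalid_operation_group_1",     # item 1 (sub-items such as unsupported formats)
    "qnan_operand",                  # item 2 (not an exception, but takes precedence)
    "other_invalid_or_zero_divide",  # item 3
    "denormal_operand",              # item 4
    "overflow_or_underflow",         # item 5
    "inexact",                       # item 6
]


def highest_priority(conditions):
    """Return the condition that governs the delivered result."""
    return min(conditions, key=PRECEDENCE.index)


# SNaN divided by zero: the SNaN (assumed to fall under item 1) wins.
print(highest_priority({"other_invalid_or_zero_divide", "invalid_operation_group_1"}))
# QNaN divided by zero: the QNaN wins, so no zero-divide exception is reported.
print(highest_priority({"other_invalid_or_zero_divide", "qnan_operand"}))
```

The model deliberately ignores the cases in which a lower-priority exception may accompany a higher one (a masked denormal, or inexact together with overflow or underflow).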
3330 3.2.8 Standard Underflow/Overflow Exception Handler
3332 As long as the underflow and overflow exceptions are masked, no additional
3333 software is required to cause the output of the 80387 to conform to the
3334 requirements of IEEE Std 754. When unmasked, these exceptions give the
3335 exception handler an additional option in the case of store instructions. No
3336 result is stored in memory; instead, the operand is left intact on the
3337 stack. The handler may round the significand of the operand on the stack to
3338 the destination's precision as the standard requires, or it may adjust the
3339 operand and reexecute the faulting instruction.
3343 Chapter 4 The 80387 Instruction Set
───────────────────────────────────────────────────────────────────────────
3347 This chapter describes the operation of all 80387 instructions. Within this
3348 section, the instructions are divided into six functional classes:
• Data Transfer instructions
• Nontranscendental instructions
• Comparison instructions
• Transcendental instructions
• Constant instructions
• Processor Control instructions
3357 Throughout this chapter, the instruction set is described as it appears to
3358 the ASM386 programmer who is coding a program. Not included in this chapter
3359 are details of instruction format, encoding, and execution times. This
3360 detailed information may be found in Appendix A. Refer also to Appendix B
3361 for a summary of the exceptions caused by each instruction.
3364 4.1 Compatibility With the 80287 and 8087
3366 The instruction set for the 80387 NPX is largely the same as that for the
3367 80287 NPX (used with 80286 systems) and that for the 8087 NPX (used with
3368 8086 and 8088 systems). Most object programs generated for the 80287 or 8087
3369 will execute without change on the 80387. Several instructions are new to
3370 the 80387, and several 80287 and 8087 instructions perform no useful
3371 function on the 80387. Appendix C and Appendix D give details of these
3372 instruction set differences.
3375 4.2 Numeric Operands
3377 The typical NPX instruction accepts one or two operands as inputs, operates
3378 on these, and produces a result as an output. An operand is most often the
3379 contents of a register or of a memory location. The operands of some
3380 instructions are predefined; for example, FSQRT always takes the square root
3381 of the number in the top NPX stack element. Others allow, or require, the
3382 programmer to explicitly code the operand(s) along with the instruction
3383 mnemonic. Still others accept one explicit operand and one implicit
3384 operand, which is usually the top NPX stack element. All 80387 instructions
3385 that have a data operand use ST as one operand or as the only operand.
3387 Whether supplied by the programmer or utilized automatically, the two basic
3388 types of operands are sources and destinations. A source operand simply
3389 supplies one of the inputs to an instruction; it is not altered by the
3390 instruction. Even when an instruction converts the source operand from one
3391 format to another (e.g., real to integer), the conversion is actually
3392 performed in an internal work area to avoid altering the source operand. A
3393 destination operand may also provide an input to an instruction. It is
3394 distinguished from a source operand, however, because its content may be
3395 altered when it receives the result produced by the operation; that is, the
3396 destination is replaced by the result.
3398 Many instructions allow their operands to be coded in more than one way.
3399 For example, FADD (add real) may be written without operands, with only a
3400 source or with a destination and a source. The instruction descriptions in
3401 this section employ the simple convention of separating alternative operand
3402 forms with slashes; the slashes, however, are not coded. Consecutive slashes
indicate an option of no explicit operands. The operands for FADD are thus
described as
3406 //source/destination, source
3408 This means that FADD may be written in any of three ways:
FADD                        Add ST to ST(1), put result in ST(1), then pop ST
FADD source                 Add source to ST(0)
FADD destination, source    Add source to destination
The assembler can allow the same instruction to be specified in different
ways; for example:
3419 FADD = FADDP ST(1), ST
3420 FADD ST(1) = FADD ST, ST(1)
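
The three ways of writing FADD can be modeled on a simple list in which index 0 stands for ST(0). This is a sketch under stated assumptions: the class and method names are hypothetical, the one-operand form is shown with a memory value as its source, and tags, rounding, and exceptions are not modeled.

```python
class NpxStackModel:
    """Toy model of the NPX register stack: st[0] is ST(0), st[1] is ST(1), ..."""

    def __init__(self, values):
        self.st = list(values)

    def fadd(self, dest=None, src=None):
        if dest is None and src is None:
            # FADD with no operands, i.e. FADDP ST(1), ST:
            # add ST to ST(1), then pop ST.
            self.st[1] += self.st[0]
            self.st.pop(0)
        elif dest is None:
            # FADD source: add a memory value to ST(0).
            self.st[0] += src
        else:
            # FADD destination, source: both operands are register indices.
            self.st[dest] += self.st[src]


s = NpxStackModel([1.0, 2.0, 3.0])
s.fadd()                 # classical stack form
print(s.st)              # [3.0, 3.0]
s.fadd(src=0.5)          # memory source
print(s.st)              # [3.5, 3.0]
s.fadd(dest=1, src=0)    # FADD ST(1), ST
print(s.st)              # [3.5, 6.5]
```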
3422 When reading this section, it is important to bear in mind that memory
3423 operands may be coded with any of the CPU's memory addressing methods
3424 provided by the ModR/M byte. To review these methods (BASE + (INDEX * SCALE)
3425 + DISPLACEMENT) refer to the 80386 Programmer's Reference Manual.
3426 Chapter 5 also provides several addressing mode examples.
3429 4.3 Data Transfer Instructions
3431 These instructions (summarized in Table 4-1) move operands among elements
3432 of the register stack, and between the stack top and memory. Any of the
3433 seven data types can be converted to extended real and loaded (pushed) onto
3434 the stack in a single operation; they can be stored to memory in the same
3435 manner. The data transfer instructions automatically update the 80387 tag
word to reflect whether the register is empty or full following the
instruction.

Table 4-1. Data Transfer Instructions

Real Transfers
  FLD      Load real
  FST      Store real
  FSTP     Store real and pop
  FXCH     Exchange registers
Integer Transfers
  FILD     Integer load
  FIST     Integer store
  FISTP    Integer store and pop
Packed Decimal Transfers
  FBLD     Packed decimal (BCD) load
  FBSTP    Packed decimal (BCD) store and pop

4.3.1 FLD source

FLD (load real) loads (pushes) the source operand onto the top of the
3458 register stack. This is done by decrementing the stack pointer by one and
3459 then copying the content of the source to the new stack top. ST(7) must be
3460 empty to avoid causing an invalid-operation exception. The new stack top is
3461 tagged nonempty. The source may be a register on the stack (ST(i)) or any of
3462 the real data types in memory. If the source is a register, the register
3463 number used is that before TOP is decremented by the instruction. Coding FLD
3464 ST(0) duplicates the stack top. Single and double real source operands are
3465 converted to extended real automatically. Loading an extended real operand
does not require conversion; therefore, the I and D exceptions do not occur
in this case.
3470 4.3.2 FST destination
3472 FST (store real) copies the NPX stack top to the destination, which
3473 may be another register on the stack or a single or double (but not
3474 extended-precision) memory operand. If the destination is single or double
3475 real, the copy of the significand is rounded to the width of the destination
3476 according to the RC field of the control word, and the copy of the exponent
3477 is converted to the width and bias of the destination format. The
3478 over/underflow condition is checked for as well.
If, however, the stack top contains zero, ±∞, or a NaN, then the stack
top's significand is not rounded but is chopped (on the right) to fit the
destination. Nor is the exponent converted; rather, it also is chopped on
the right and transferred "as is". This preserves the value's identification
as ∞ or a NaN (exponent all ones) so that it can be properly loaded and used
3485 later in the program if desired.
3487 Note that the 80387 does not signal the invalid-operation exception when
3488 the destination is a nonempty stack element.
3491 4.3.3 FSTP destination
3493 FSTP (store real and pop) operates identically to FST except that the NPX
3494 stack is popped following the transfer. This is done by tagging the top
3495 stack element empty and then incrementing TOP. FSTP also permits storing to
3496 an extended-precision real memory variable, whereas FST does not. If the
3497 source operand is a register, the register number used is that before TOP is
3498 incremented by the instruction. Coding FSTP ST(0) is equivalent to popping
3499 the stack with no data transfer.
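
The push and pop mechanics described for FLD and FSTP (decrement TOP and write, or tag empty and increment TOP) can be sketched as follows. This is a simplified model with hypothetical names: an empty tag is represented by None, only the pop-without-store case FSTP ST(0) is shown, and no format conversion or exception reporting is modeled beyond the ST(7) check.

```python
class RegisterStackModel:
    """Eight physical registers addressed relative to a TOP pointer."""

    def __init__(self):
        self.regs = [None] * 8   # None models the "empty" tag
        self.top = 0

    def _phys(self, i):
        """Physical register index of ST(i)."""
        return (self.top + i) % 8

    def can_push(self):
        """A load is valid only if ST(7) is empty."""
        return self.regs[self._phys(7)] is None

    def fld(self, value):
        if not self.can_push():
            raise OverflowError("invalid operation: ST(7) is not empty")
        self.top = (self.top - 1) % 8    # decrement TOP first
        self.regs[self.top] = value      # then fill the new stack top

    def fld_st(self, i):
        # The register number is resolved before TOP is decremented.
        self.fld(self.regs[self._phys(i)])

    def fstp_st0(self):
        """FSTP ST(0): pop the stack with no data transfer; return the old top."""
        value = self.regs[self.top]
        self.regs[self.top] = None       # tag the top element empty
        self.top = (self.top + 1) % 8    # then increment TOP
        return value


s = RegisterStackModel()
s.fld(1.5)
s.fld_st(0)          # FLD ST(0) duplicates the stack top
print(s.top)         # 6
print(s.fstp_st0())  # 1.5
```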
3502 4.3.4 FXCH //destination
3504 FXCH (exchange registers) swaps the contents of the destination and the
3505 stack top registers. If the destination is not coded explicitly, ST(1) is
3506 used. Many 80387 instructions operate only on the stack top; FXCH provides a
3507 simple means of effectively using these instructions on lower stack
3508 elements. For example, the following sequence takes the square root of the
3509 third register from the top (assuming that ST is nonempty):
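
The exchange, operate, exchange pattern can be modeled on a list in which index 0 stands for ST. This sketch is an illustration only: the function name is hypothetical, and the register index is left as a parameter because the manual's own listing is not reproduced here.

```python
import math


def sqrt_of_lower_register(st, i):
    """Take the square root of ST(i) by swapping it to the stack top and back."""
    st[0], st[i] = st[i], st[0]   # FXCH ST(i)
    st[0] = math.sqrt(st[0])      # FSQRT operates only on the stack top
    st[0], st[i] = st[i], st[0]   # FXCH ST(i) restores the original order
    return st


print(sqrt_of_lower_register([1.0, 2.0, 3.0, 16.0], 3))   # [1.0, 2.0, 3.0, 4.0]
```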

4.3.5 FILD source

FILD (integer load) converts the source memory operand from its binary
3519 integer format (word, short, or long) to extended real and pushes the result
3520 onto the NPX stack. ST(7) must be empty to avoid causing an exception. The
3521 (new) stack top is tagged nonempty. FILD is an exact operation; the source
3522 is loaded with no rounding error.
3525 4.3.6 FIST destination
FIST (integer store) rounds the content of the stack top to an integer
3528 according to the RC field (rounding control) of the control word and
3529 transfers the result to the destination, leaving the stack top unchanged.
3530 The destination may define a word or short integer variable. Negative zero
3531 is stored in the same encoding as positive zero: 0000...00.
3534 4.3.7 FISTP destination
FISTP (integer store and pop) operates like FIST except that it also pops
the NPX stack following the transfer. The destination may be any of the
binary integer data types.


4.3.8 FBLD source

FBLD (packed decimal (BCD) load) converts the content of the source operand
3544 from packed decimal to extended real and pushes the result onto the NPX
3545 stack. ST(7) must be empty to avoid causing an exception. The sign of the
3546 source is preserved, including the case where the value is negative zero.
3547 FBLD is an exact operation; the source is loaded with no rounding error.
3549 The packed decimal digits of the source are assumed to be in the range 0-9.
3550 The instruction does not check for invalid digits (A-FH), and the result of
3551 attempting to load an invalid encoding is undefined.
3554 4.3.9 FBSTP destination
3556 FBSTP (packed decimal (BCD) store and pop) converts the content of the
3557 stack top to a packed decimal integer, stores the result at the destination
3558 in memory, and pops the stack. FBSTP rounds a nonintegral value according to
3559 the RC (rounding control) field of the control word.
3562 4.4 Nontranscendental Instructions
3564 The 80387's nontranscendental instruction set (Table 4-2) provides a wealth
3565 of variations on the basic add, subtract, multiply, and divide operations,
3566 and a number of other useful functions. These range from a simple absolute
3567 value to a square root instruction that executes faster than ordinary
3568 division; 80387 programmers no longer need to spend valuable time
3569 eliminating square roots from algorithms because they run too slowly. Other
3570 nontranscendental instructions perform exact modulo division, round real
3571 numbers to integers, and scale values by powers of two.
3573 The 80387's basic nontranscendental instructions (addition, subtraction,
3574 multiplication, and division) are designed to encourage the development of
3575 very efficient algorithms. In particular, they allow the programmer to
3576 reference memory as easily as the NPX register stack.
3578 Table 4-3 summarizes the available operation/operand forms that are
3579 provided for basic arithmetic. In addition to the four normal operations,
3580 two "reversed" instructions make subtraction and division "symmetrical" like
addition and multiplication. The variety of instruction and operand forms
gives the programmer unusual flexibility:

• Operands may be located in registers or memory.

• Results may be deposited in a choice of registers.

• Operands may be a variety of NPX data types: extended real, double
3589 real, single real, short integer or word integer, with automatic
3590 conversion to extended real performed by the 80387.
3592 Five basic instruction forms may be used across all six operations, as
3593 shown in Table 4-3. The classical stack form may be used to make the 80387
3594 operate like a classical stack machine. No operands are coded in this form,
3595 only the instruction mnemonic. The NPX picks the source operand from the
3596 stack top and the destination from the next stack element. It then pops the
3597 stack, performs the operation, and returns the result to the new stack top,
3598 effectively replacing the operands by the result.
3600 The register form is a generalization of the classical stack form; the
3601 programmer specifies the stack top as one operand and any register on the
3602 stack as the other operand. Coding the stack top as the destination provides
3603 a convenient way to access a constant, held elsewhere in the stack, from the
stack top. The destination need not always be ST, however. All two-operand
3605 instructions allow use of another register as the destination. This coding
3606 (ST is the source operand) allows, for example, adding the stack top into a
3607 register used as an accumulator.
3609 Often the operand in the stack top is needed for one operation but then is
3610 of no further use in the computation. The register pop form can be used to
3611 pick up the stack top as the source operand, and then discard it by popping
3612 the stack. Coding operands of ST(1), ST with a register pop mnemonic is
3613 equivalent to a classical stack operation: the top is popped and the result
3614 is left at the new top.
3616 The two memory forms increase the flexibility of the 80387's
3617 nontranscendental instructions. They permit a real number or a binary
3618 integer in memory to be used directly as a source operand. This is useful in
3619 situations where operands are not used frequently enough to justify holding
3620 them in registers. Note that any memory addressing method may be used to
3621 define these operands, so they may be elements in arrays, structures, or
3622 other data organizations, as well as simple scalars.
3624 The six basic operations are discussed further in the next paragraphs, and
3625 descriptions of the remaining seven operations follow.

Table 4-2. Nontranscendental Instructions

Addition
  FADD      Add real
  FADDP     Add real and pop
  FIADD     Integer add
Subtraction
  FSUB      Subtract real
  FSUBP     Subtract real and pop
  FISUB     Integer subtract
  FSUBR     Subtract real reversed
  FSUBRP    Subtract real reversed and pop
  FISUBR    Integer subtract reversed
Multiplication
  FMUL      Multiply real
  FMULP     Multiply real and pop
  FIMUL     Integer multiply
Division
  FDIV      Divide real
  FDIVP     Divide real and pop
  FIDIV     Integer divide
  FDIVR     Divide real reversed
  FDIVRP    Divide real reversed and pop
  FIDIVR    Integer divide reversed
Other Operations
  FSQRT     Square root
  FSCALE    Scale
  FPREM     Partial remainder
  FPREM1    IEEE standard partial remainder
  FRNDINT   Round to integer
  FXTRACT   Extract exponent and significand
  FABS      Absolute value
  FCHS      Change sign

Table 4-3. Basic Nontranscendental Instructions and Operands

                             Mnemonic   Operand Forms
Instruction Form             Form       destination, source       ASM386 Example

Classical stack              Fop        [ST(1), ST]               FADD
Classical stack, extra pop   FopP       [ST(1), ST]               FADDP
Register                     Fop        ST(i), ST or ST, ST(i)    FSUB    ST, ST(3)
Register pop                 FopP       ST(i), ST                 FMULP   ST(2), ST
Real memory                  Fop        [ST,] single/double       FDIV    AZIMUTH
Integer memory               FIop       [ST,] word-integer/       FIDIV   PULSES
                                        short-integer
───────────────────────────────────────────────────────────────────────────
Brackets ([]) surround implicit operands; these are not coded, and are
shown here for information only.

op = ADD     destination ← destination + source
     SUB     destination ← destination - source
     SUBR    destination ← source - destination
     MUL     destination ← destination * source
     DIV     destination ← destination ÷ source
     DIVR    destination ← source ÷ destination
───────────────────────────────────────────────────────────────────────────

4.4.1 Addition

FADD    //source/destination,source
FADDP   //destination,source
FIADD   source

The addition instructions (add real, add real and pop, integer add) add the
source and destination operands and return the sum to the destination. The
operand at the stack top may be doubled by coding:

    FADD    ST, ST(0)

3706 If the source operand is in memory, conversion of an integer, a single
3707 real, or a double real operand to extended real is performed automatically.
3710 4.4.2 Normal Subtraction
FSUB    //source/destination,source
FSUBP   //destination,source
FISUB   source

3716 The normal subtraction instructions (subtract real, subtract real and pop,
3717 integer subtract) subtract the source operand from the destination and
3718 return the difference to the destination.
3721 4.4.3 Reversed Subtraction
FSUBR   //source/destination,source
FSUBRP  //destination,source
FISUBR  source

3727 The reversed subtraction instructions (subtract real reversed, subtract
3728 real reversed and pop, integer subtract reversed) subtract the destination
3729 from the source and return the difference to the destination. For example,
3730 FSUBR ST, ST(1) means subtract ST from ST(1) and leave the result in ST.
3733 4.4.4 Multiplication
FMUL    //source/destination,source
FMULP   //destination,source
FIMUL   source

The multiplication instructions (multiply real, multiply real and pop,
integer multiply) multiply the source and destination operands and return
the product to the destination. Coding FMUL ST, ST(0) squares the content of
the stack top.

3745 4.4.5 Normal Division
FDIV    //source/destination,source
FDIVP   //destination,source
FIDIV   source

The normal division instructions (divide real, divide real and pop, integer
divide) divide the destination by the source and return the quotient to the
destination.

3756 4.4.6 Reversed Division
FDIVR   //source/destination,source
FDIVRP  //destination,source
FIDIVR  source

3762 The reversed division instructions (divide real reversed, divide real
3763 reversed and pop, integer divide reversed) divide the source operand by the
3764 destination and return the quotient to the destination.
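
The difference between the normal and the reversed forms is only the order of the operands in the subtraction or division; the result always goes to the destination. The table of six operations can be sketched as a lookup of two-argument functions. The dictionary name and the sample values are hypothetical.

```python
# Each entry maps (destination, source) to the new destination value.
BASIC_OPS = {
    "ADD":  lambda dest, src: dest + src,
    "SUB":  lambda dest, src: dest - src,
    "SUBR": lambda dest, src: src - dest,
    "MUL":  lambda dest, src: dest * src,
    "DIV":  lambda dest, src: dest / src,
    "DIVR": lambda dest, src: src / dest,
}

st0, st1 = 2.0, 8.0   # destination ST and source ST(1), as in "Fop ST, ST(1)"
for name in ("SUB", "SUBR", "DIV", "DIVR"):
    print(name, BASIC_OPS[name](st0, st1))
```

With ST = 2 and ST(1) = 8, the model leaves -6 in ST for SUB and 6 for SUBR, and 0.25 for DIV and 4 for DIVR.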


4.4.7 FSQRT

FSQRT (square root) replaces the content of the top stack element with its
3770 square root. (Note: The square root of -0 is defined to be -0.)


4.4.8 FSCALE

FSCALE (scale) interprets the value contained in ST(1) as an integer and
adds this value to the exponent of the number in ST. This is equivalent to:

    ST ← ST * 2^(ST(1))

3780 Thus, FSCALE provides rapid multiplication or division by integral powers
3781 of 2. It is particularly useful for scaling the elements of a vector.
3783 There is no limit on the range of the scale factor in ST(1). If the value
3784 is not integral, FSCALE uses the nearest integer smaller in magnitude; i.e.,
it chops the value toward 0. If the resulting integer is zero, the value in
ST is not changed.
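
The chopping of a nonintegral scale factor can be illustrated with a short model. This is a sketch that assumes finite double-precision inputs; the function name is hypothetical, and overflow, underflow, and special operands are not modeled.

```python
import math


def fscale_model(st0, st1):
    """Model of FSCALE: ST <- ST * 2^(ST(1)), with ST(1) chopped toward zero."""
    n = math.trunc(st1)          # nearest integer smaller in magnitude
    return math.ldexp(st0, n)    # exact scaling by a power of two


print(fscale_model(3.0, 2.9))    # 12.0  (scale factor chopped to 2)
print(fscale_model(3.0, -1.5))   # 1.5   (scale factor chopped to -1)
print(fscale_model(3.0, 0.7))    # 3.0   (scale factor chopped to 0, ST unchanged)
```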


4.4.9 FPREM -- Partial Remainder (80287/8087-Compatible)

3791 FPREM computes the remainder of division of ST by ST(1) and leaves the
result in ST. FPREM finds a remainder REM and a quotient Q such that

    REM = ST - ST(1)*Q

3796 The quotient Q is chosen to be the integer obtained by chopping the exact
3797 value of ST/ST(1) toward zero. The sign of the remainder is the same as the
3798 sign of the original dividend from ST.
3800 By ignoring precision control, the 80387 produces an exact result with
3801 FPREM. The precision (inexact) exception does not occur and the rounding
3802 control has no effect.
3804 The FPREM instruction is not the remainder operation specified in the IEEE
3805 standard. To get that remainder, the FPREM1 instruction should be used.
3807 The FPREM instruction is designed to be executed iteratively in a
3808 software-controlled loop. It operates by performing successive scaled
3809 subtractions; therefore, obtaining the exact remainder when the operands
3810 differ greatly in magnitude can consume large amounts of execution time.
3811 Because the 80387 can only be preempted between instructions, the remainder
3812 function could seriously increase interrupt latency in these cases. For
this reason, the maximum number of iterations is limited. The instruction
may terminate before it has completed the calculation. The C2
3815 bit of the status word indicates whether the calculation is complete or
3816 whether the instruction must be executed again.
3818 FPREM can reduce the exponent of ST by up to (but not including) 64 in one
3819 execution. If FPREM produces a remainder that is less than the modulus
3820 (i.e., the divisor), the function is complete and bit C2 of the status word
3821 condition code is cleared. If the function is incomplete, C2 is set to 1;
the result in ST is then called the partial remainder. Software can inspect
C2 by storing the status word following execution of FPREM and can reexecute
the instruction (using the partial remainder in ST as the dividend) until C2
is cleared. A higher-priority interrupting routine that needs the 80387 can
force a context switch between the instructions in the remainder loop.
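
The software-controlled loop can be sketched with exact rational arithmetic. This is a model under stated assumptions, not the 80387 algorithm itself: the function names are hypothetical, a fixed exponent reduction of 63 per incomplete execution is assumed (the text only bounds it below 64), and only finite nonzero divisors are handled.

```python
import math
from fractions import Fraction

MAX_REDUCTION = 63   # assumed exponent reduction per incomplete execution


def fprem_step(st, st1):
    """Model one execution of FPREM; return (new_st, c2)."""
    x, y = Fraction(st), Fraction(st1)
    # Difference between the binary exponents of dividend and divisor.
    d = math.frexp(st)[1] - math.frexp(st1)[1]
    if d < 64:
        q = math.trunc(x / y)               # quotient chopped toward zero
        return float(x - y * q), 0          # complete: C2 = 0
    scaled = y * 2 ** (d - MAX_REDUCTION)   # divisor scaled by a power of two
    q = math.trunc(x / scaled)
    return float(x - scaled * q), 1         # partial remainder: C2 = 1


def fprem_loop(st, st1):
    """Reexecute the step until C2 is clear; return (remainder, executions)."""
    executions = 0
    while True:
        st, c2 = fprem_step(st, st1)
        executions += 1
        if c2 == 0:
            return st, executions


print(fprem_loop(-7.5, 2.0))    # (-1.5, 1)
print(fprem_loop(1e30, 3.0))    # two executions are needed in this model
```

Because every step is exact, the final remainder agrees with a single exact truncating remainder of the original operands.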
3828 An important use for FPREM is to reduce arguments (operands) of
3829 transcendental functions to the range permitted by these instructions. For
3830 example, the FPTAN (tangent) instruction requires its argument ST to be less
than 2^(63). For π/4 < |ST| < 2^(63), FPTAN (as well as the other
trigonometric instructions) performs an internal reduction of ST to a value
less than π/4 using an internally stored π/4 divisor that has 67 significant
bits. Because of its greater accuracy, this method of reduction is
recommended when the argument is within the required range.

However, when |ST| ≥ 2^(63), FPREM can be employed to reduce ST. With π/4 as
3838 a modulus, FPREM can reduce an argument so that it is within range of FPTAN
3839 and so that no further reduction is required by FPTAN.
3841 Because FPREM produces an exact result, the argument reduction does not
3842 introduce roundoff error into the calculation, even if several iterations
are required to bring the argument into range. However, π itself cannot be
represented exactly. The rounding of π, when it is used by FPREM to reduce an
argument for a periodic trigonometric function, does not create the effect of
a rounded argument, but of a rounded period.

When reduction is complete, FPREM provides the least-significant three bits
of the quotient generated by FPREM (in C{3}, C{1}, C{0}). This is also
important for transcendental argument reduction, because it locates the
original angle in the correct one of eight π/4 segments of the unit circle
(see Table 4-4).

Table 4-4. Condition Code Interpretation after FPREM and FPREM1

      Condition Code           Interpretation after
C2(PF)   C3    C1    C0        FPREM and FPREM1
───────────────────────────────────────────────────────────────────────────
                               Incomplete Reduction:
  1      X     X     X         further iteration required
                               for complete reduction

         Q1    Q0    Q2        Q MOD 8
  0      0     0     0         0
  0      0     1     0         1
  0      1     0     0         2        Complete Reduction:
  0      1     1     0         3        C0, C3, C1 contain three least
  0      0     0     1         4        significant bits of quotient
  0      0     1     1         5
  0      1     0     1         6
  0      1     1     1         7
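
The mapping of the three low-order quotient bits onto C0, C3, and C1 can be sketched as follows. This is a model only: the function names are hypothetical, a nonnegative dividend and a positive divisor are assumed, and math.pi/4 stands in for whatever rounded value of π/4 a program would actually load.

```python
import math
from fractions import Fraction


def quotient_condition_bits(st, st1):
    """Return the condition bits holding the low three bits of the quotient."""
    q = math.trunc(Fraction(st) / Fraction(st1))   # quotient chopped toward zero
    return {"C0": (q >> 2) & 1, "C3": (q >> 1) & 1, "C1": q & 1}


def octant(bits):
    """Recombine C0, C3, C1 into Q MOD 8, the π/4 segment of the angle."""
    return (bits["C0"] << 2) | (bits["C3"] << 1) | bits["C1"]


bits = quotient_condition_bits(2.5, math.pi / 4)
print(bits, octant(bits))   # an angle of 2.5 radians lies in segment 3
```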


4.4.10 FPREM1 -- Partial Remainder (IEEE Std. 754-Compatible)

3878 FPREM1 computes the remainder of division of ST by ST(1) and leaves the
3879 result in ST. FPREM1 finds a remainder REM1 and a quotient Q1 such that
3881 REM1 = ST - ST(1)*Q1
3883 The quotient Q1 is chosen to be the integer nearest to the exact value of
3884 ST/ST(1). When ST/ST(1) is exactly N + 1/2 (for some integer N), there are
3885 two integers equally close to ST/ST(1). In this case the value chosen for Q1
3886 is the even integer.
3888 The result produced by FPREM1 is always exact; no rounding is necessary,
3889 and therefore the precision exception does not occur and the rounding
3890 control has no effect.
3892 The FPREM1 instruction is designed to be executed iteratively in a
3893 software-controlled loop. FPREM1 operates by performing successive scaled
3894 subtractions; therefore, obtaining the exact remainder when the operands
3895 differ greatly in magnitude can consume large amounts of execution time.
3896 Because the 80387 can only be preempted between instructions, the remainder
3897 function could seriously increase interrupt latency in these cases. For
this reason, the maximum number of iterations is limited. The instruction
may terminate before it has completed the calculation. The C2
3900 bit of the status word indicates whether the calculation is complete or
3901 whether the instruction must be executed again.
3903 FPREM1 can reduce the exponent of ST by up to (but not including) 64 in one
3904 execution. If FPREM1 produces a remainder that is less than the modulus
3905 (i.e., the divisor), the function is complete and bit C2 of the status word
3906 condition code is cleared. If the function is incomplete, C2 is set to 1;
the result in ST is then called the partial remainder. Software can inspect
C2 by storing the status word following execution of FPREM1 and can reexecute
the instruction (using the partial remainder in ST as the dividend) until C2
is cleared. When C2 is cleared, FPREM1 also provides the least-significant
three bits of the quotient generated by FPREM1 (in C{3}, C{1}, C{0}).
3913 The uses for FPREM1 are the same as those for FPREM.
FPREM1 differs from FPREM in these respects:

• FPREM and FPREM1 choose the value of the quotient differently; the
  low-order three bits of the quotient as reported in bits C3, C1, C0 of
  the status word may differ by one in some cases.

• FPREM and FPREM1 may produce different remainders. FPREM produces a
  remainder R such that 0 ≤ R < |ST(1)| or -|ST(1)| < R ≤ 0, depending
  on the sign of the dividend. FPREM1 produces a remainder R1 such that
  -|ST(1)|/2 ≤ R1 ≤ +|ST(1)|/2 (the endpoints occur only when ST/ST(1)
  lies exactly halfway between two integers).
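
The two quotient rules can be compared with Python's standard library, whose math.fmod chops the quotient toward zero and whose math.remainder rounds it to the nearest integer with ties to even. Using these functions as stand-ins for FPREM and FPREM1 is an analogy for illustration, in double precision, and not a statement about how the 80387 computes its results.

```python
import math

for st, st1 in [(5.0, 3.0), (7.0, 2.0), (-5.0, 3.0)]:
    rem = math.fmod(st, st1)         # chopped quotient, as described for FPREM
    rem1 = math.remainder(st, st1)   # nearest-even quotient, as described for FPREM1
    print(st, st1, rem, rem1)
```

For 7 and 2 the exact quotient is 3.5, so the nearest-even rule picks Q1 = 4 and the remainder is -1, exactly half the modulus in magnitude.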


4.4.11 FRNDINT

FRNDINT (round to integer) rounds the top stack element to an integer
3930 according to the RC bits of the control word. For example, assume that ST
3931 contains the 80387 real number encoding of the decimal value 155.625.
3932 FRNDINT will change the value to 155 if the RC field of the control word is
3933 set to down or chop, or to 156 if it is set to up or nearest.
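
The 155.625 example can be reproduced with a small lookup of rounding functions. This is a sketch with hypothetical names; the four keys stand for the four RC settings named in the text, and Python's built-in round is used for "nearest" because it rounds halfway cases to even.

```python
import math

ROUNDING = {
    "nearest": round,       # halfway cases go to the even integer
    "down": math.floor,     # toward negative infinity
    "up": math.ceil,        # toward positive infinity
    "chop": math.trunc,     # toward zero
}


def frndint_model(value, rc):
    """Round value to an integer according to the named rounding mode."""
    return float(ROUNDING[rc](value))


for rc in ("nearest", "down", "up", "chop"):
    print(rc, frndint_model(155.625, rc))
```

The modes "down" and "chop" differ only for negative values: the model gives -156 and -155 respectively for -155.625.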


4.4.12 FXTRACT

FXTRACT (extract exponent and significand) performs a superset of the
3939 IEEE-recommended logb(x) function by "decomposing" the number in the stack
3940 top into two numbers that represent the actual value of the operand's
3941 exponent and significand fields. The "exponent" replaces the original
3942 operand on the stack and the "significand" is pushed onto the stack. (ST(7)
3943 must be empty to avoid causing the invalid-operation exception.) Following
3944 execution of FXTRACT, ST (the new stack top) contains the value of the
3945 original significand expressed as a real number: its sign is the same as the
3946 operand's, its exponent is 0 true (16,383 or 3FFFH biased), and its
3947 significand is identical to the original operand's. ST(1) contains the value
of the original operand's true (unbiased) exponent expressed as a real
number.
If the original operand is zero, FXTRACT leaves -∞ in ST(1) (the exponent)
3952 while ST is assigned the value zero with a sign equal to that of the
3953 original operand. The zero-divide exception is raised in this case, as well.
To illustrate the operation of FXTRACT, assume that ST contains a number
whose true exponent is +4 (i.e., its exponent field contains 4003H). After
executing FXTRACT, ST(1) will contain the real number +4.0; its sign will be
positive, its exponent field will contain 4001H (+2 true) and its
significand field will contain 1.00...00B (the binary point follows the
leading bit). In other words, the value in ST(1) will be 1.0 * 2² = 4. If ST
contains an operand whose true exponent is -7 (i.e., its exponent field
contains 3FF8H), then FXTRACT will return an "exponent" of -7.0; after the
instruction executes, ST(1)'s sign and exponent fields will contain C001H
(negative sign, true exponent of 2), and its significand will be
1.1100...00B. In other words, the value in ST(1) will be -1.75 * 2² = -7.0.
In both cases, following FXTRACT, ST's sign and significand fields will be
the same as the original operand's, and its exponent field will contain
3FFFH (0 true).
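The decomposition can be modeled with the standard frexp function. The
sketch below is a double-precision illustration for zero and finite nonzero
operands only; the name fxtract and the returned tuple order (ST(1), then
ST) are choices made for this example.

```python
import math

def fxtract(x):
    """Return (exponent, significand): the values left in ST(1) and ST."""
    if x == 0.0:
        # Zero keeps its sign in ST; the exponent becomes minus infinity.
        return (-math.inf, x)
    m, e = math.frexp(x)            # x == m * 2**e with 0.5 <= |m| < 1
    return (float(e - 1), m * 2.0)  # rescale so that 1 <= |significand| < 2
```

For an operand of 16.0 (true exponent +4) the model returns an exponent of
4.0 and a significand of 1.0, matching the first case discussed above.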
3969 FXTRACT is useful for power and range scaling operations. Both FXTRACT and
3970 the base 2 exponential instruction F2XM1 are needed to perform a general
3971 power operation. Converting numbers in 80387 extended real format to decimal
3972 representations (e.g., for printing or displaying) requires not only FBSTP
3973 but also FXTRACT to allow scaling that does not overflow the range of the
3974 extended format. FXTRACT can also be useful for debugging, because it allows
the exponent and significand parts of a real number to be examined
separately.
3981 FABS (absolute value) changes the top stack element to its absolute value
3982 by making its sign positive. Note that the invalid-operation exception is
not signaled even if the operand is a signaling NaN or has a format that is
not supported.
3989 FCHS (change sign) complements (reverses) the sign of the top stack
3990 element. Note that the invalid-operation exception is not signaled even if
3991 the operand is a signaling NaN or has a format that is not supported.
3994 4.5 Comparison Instructions
3996 The instructions of this class allow comparison of numbers of all supported
3997 real and integer data types. Each of these instructions (Table 4-5)
3998 analyzes the top stack element, often in relationship to another operand,
3999 and reports the result as a condition code in the status word.
4001 The basic operations are compare, test (compare with zero), and examine
4002 (report type, sign, and normalization). Special forms of the compare
4003 operation are provided to optimize algorithms by allowing direct comparisons
4004 with binary integers and real numbers in memory, as well as popping the
4005 stack after a comparison.
4007 The FSTSW (store status word) instruction may be used following a
4008 comparison to transfer the condition code to memory or to the 80386 AX
4009 register for inspection. The 80386 SAHF instruction is recommended for
copying the 80387 flags from AX to the 80386 flags for easy conditional
branching.
4013 Note that instructions other than those in the comparison group may update
4014 the condition code. To ensure that the status word is not altered
4015 inadvertently, store it immediately following a comparison operation.
Table 4-5. Comparison Instructions

   FCOM       Compare real
   FCOMP      Compare real and pop
   FCOMPP     Compare real and pop twice
   FICOM      Integer compare
   FICOMP     Integer compare and pop
   FTST       Test
   FUCOM      Unordered compare real
   FUCOMP     Unordered compare real and pop
   FUCOMPP    Unordered compare real and pop twice
   FXAM       Examine
4034 FCOM (compare real) compares the stack top to the source operand. The
4035 source operand may be a register on the stack, or a single or double real
4036 memory operand. If an operand is not coded, ST is compared to ST(1). The
4037 sign of zero is ignored, so that +0 = -0. Following the instruction, the
4038 condition codes reflect the order of the operands as shown in Table 4-6.
4040 If either operand is a NaN (either quiet or signaling) or an undefined
4041 format, or if a stack fault occurs, the invalid-operation exception is
4042 raised and the condition bits are set to "unordered."
Table 4-6. Condition Code Resulting from Comparisons

   Order           C3 (ZF)   C2 (PF)   C0 (CF)   Conditional Branch
   ST > Operand       0         0         0      JA
   ST < Operand       0         0         1      JB
   ST = Operand       1         0         0      JE
   Unordered          1         1         1      JP
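The mapping from operand order to condition code can be expressed as a small
Python model. It is a sketch only: the function name fcom is borrowed from
the instruction for readability, and the "unordered" encoding with C3, C2,
and C0 all set is the conventional one, stated here as an assumption.

```python
import math

def fcom(st, operand):
    """Return the tuple (C3, C2, C0) that a comparison would produce."""
    if math.isnan(st) or math.isnan(operand):
        return (1, 1, 1)   # unordered (assumed encoding, see text above)
    if st > operand:
        return (0, 0, 0)
    if st < operand:
        return (0, 0, 1)
    return (1, 0, 0)       # equal; +0 and -0 compare as equal
```

Because the sign of zero is ignored, fcom(0.0, -0.0) reports equality.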
4057 4.5.2 FCOMP //source
FCOMP (compare real and pop) operates like FCOM, and in addition pops the
stack.
4065 FCOMPP (compare real and pop twice) operates like FCOM and additionally
4066 pops the stack twice, discarding both operands. FCOMPP always compares ST to
4067 ST(1); no operands may be explicitly specified.
4072 FICOM (integer compare) converts the source operand, which may reference a
4073 word or short binary integer variable, to extended real and compares the
stack top to it. The condition code bits in the status word are set as for
FCOM.
4080 FICOMP (integer compare and pop) operates identically to FICOM and
4081 additionally discards the value in ST by popping the NPX stack.
4086 FTST (test) tests the top stack element by comparing it to zero. The result
4087 is posted to the condition codes as shown in Table 4-7.
Table 4-7. Condition Code Resulting from FTST

   Order           C3 (ZF)   C2 (PF)   C0 (CF)   Conditional Branch
   ST > 0.0           0         0         0      JA
   ST < 0.0           0         0         1      JB
   ST = 0.0           1         0         0      JE
   Unordered          1         1         1      JP
4102 4.5.7 FUCOM //source
4104 FUCOM (unordered compare real) operates like FCOM, with two differences:
4106 1. It does not cause an invalid-operation exception when one of the
4107 operands is a NaN. If either operand is a NaN, the condition bits of
4108 the status word are set to unordered as shown in Table 4-6.
4110 2. Only operands on the NPX stack can be compared.
4113 4.5.8 FUCOMP //source
FUCOMP (unordered compare real and pop) operates like FUCOM and in addition
pops the NPX stack.
FUCOMPP (unordered compare real and pop twice) operates like FUCOM and in
4122 addition pops the NPX stack twice, discarding both operands. FUCOMPP always
4123 compares ST to ST(1); no operands can be explicitly specified.
4128 FXAM (examine) reports the content of the top stack element as
4129 positive/negative and NaN, denormal, normal, zero, infinity, unsupported, or
empty. Table 4-8 lists and interprets all the condition code values that
FXAM generates.
4134 4.6 Transcendental Instructions
4136 The instructions in this group (Table 4-9) perform the time-consuming core
4137 calculations for all common trigonometric, inverse trigonometric,
4138 hyperbolic, inverse hyperbolic, logarithmic, and exponential functions. The
4139 transcendentals operate on the top one or two stack elements, and they
4140 return their results to the stack. The trigonometric operations assume their
4141 arguments are expressed in radians. The logarithmic and exponential
4142 operations work in base 2.
4144 The results of transcendental instructions are highly accurate. The
4145 absolute value of the relative error of the transcendental instructions is
4146 guaranteed to be less than 2^(-62). (Relative error is the ratio between the
4147 absolute error and the exact value.)
4149 The trigonometric functions accept a practically unrestricted range of
4150 operands, whereas the other transcendental instructions require that
4151 arguments be more restricted in range. FPREM or FPREM1 may be used to bring
4152 the otherwise valid operand of a periodic function into range. Prologue and
4153 epilogue software may be used to reduce arguments for other instructions to
4154 the expected range and to adjust the result to correspond to the original
4155 arguments if necessary. The instruction descriptions in this section
4156 document the allowed operand range for each instruction.
4159 Table 4-8. Condition Code Defining Operand Class
4161 C3 C2 C1 C0 Value at TOP
4163 0 0 0 0 +Unsupported
4165 0 0 1 0 -Unsupported
Table 4-9. Transcendental Instructions

   FSIN       Sine
   FCOS       Cosine
   FSINCOS    Sine and cosine
   FPTAN      Tangent of ST
   FPATAN     Arctangent of ST(1)/ST
   F2XM1      2^(X) - 1; X is ST
   FYL2X      Y * log{2}X; Y is ST(1), X is ST
   FYL2XP1    Y * log{2}(X + 1); Y is ST(1), X is ST
When complete, FCOS (cosine) replaces the contents of ST with COS(ST). ST,
expressed in radians, must lie in the range |θ| < 2^(63) (for most practical
purposes unrestricted). If ST is in range, C2 of the status word is cleared
and the result of the operation is produced.
4198 If the operand is outside of the range, C2 is set to one (function
4199 incomplete) and ST remains intact (i.e., no reduction of the operand is
performed). It is the programmer's responsibility to reduce the operand to an
4201 absolute value smaller than 2^(63). The instructions FPREM1 and FPREM are
4202 available for this purpose.
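The C2 protocol can be illustrated with a Python model. The sketch assumes
finite double-precision operands; the names fcos and cos_any are invented
for the example, and math.fmod merely stands in for an FPREM reduction loop.
Reducing a very large operand against a double-precision 2π carries little
numerical meaning, so the model shows the control flow, not the accuracy.

```python
import math

LIMIT = 2.0 ** 63

def fcos(st):
    """Return (new_st, c2); c2 == 1 means the operand was out of range."""
    if abs(st) >= LIMIT:
        return (st, 1)              # operand left intact, function incomplete
    return (math.cos(st), 0)

def cos_any(x):
    """Retry with a reduced operand when C2 reports an incomplete function."""
    result, c2 = fcos(x)
    if c2:
        reduced = math.fmod(x, 2.0 * math.pi)   # plays the role of FPREM
        result, c2 = fcos(reduced)
    return result
```

A caller that always tests C2 after the instruction, as cos_any does, never
mistakes an untouched operand for a cosine.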
4207 When complete, this function replaces the contents of ST with SIN(ST). FSIN
is equivalent to FCOS in the way it reduces the operand. ST is expressed in
radians.
4214 When complete, this instruction replaces the contents of ST with SIN(ST),
4215 then pushes COS(ST) onto the stack. (ST(7) must be empty to avoid an invalid
4216 exception.) FSINCOS is equivalent to FCOS in the way it reduces the operand.
4217 ST is expressed in radians.
4222 When complete, FPTAN (partial tangent) computes the function Y = TAN (ST).
4223 ST is expressed in radians. Y replaces ST, then the value 1 is pushed,
4224 becoming the new stack top. (ST(7) must be empty to avoid an invalid
4225 exception.) When the function is complete ST(1) = TAN (arg) and ST = 1.
4226 FPTAN is equivalent to FCOS in the way it reduces the operand.
4228 The fact that FPTAN places two results on the stack maintains compatibility
4229 with the 8087/80287 and aids the calculation of other trigonometric
4230 functions that can be derived from tan via standard trigonometric
identities. For example, the cot function is given by this identity:

   cot x = 1 / tan x.

4235 Therefore, simply executing the reverse divide instruction FDIVR after
4236 FPTAN yields the cot function.
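The two-result convention can be modeled as follows. This Python sketch is
an illustration under stated assumptions: fptan returns the pair (ST(1), ST)
that the instruction leaves on the stack, and the final division stands in
for the reverse divide.

```python
import math

def fptan(st):
    """Return the stack pair (ST(1), ST) after FPTAN: (tan(st), 1.0)."""
    return (math.tan(st), 1.0)

def cot(x):
    tan_x, one = fptan(x)
    return one / tan_x     # reverse divide: the pushed 1 over the tangent
```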
FPATAN (arctangent) computes the function θ = ARCTAN (Y/X). X is taken from
ST(0) and Y from ST(1). The instruction pops the NPX stack and returns θ to
the (new) stack top, overwriting the Y operand. The result is expressed in
radians. The range of operands is not restricted; however, the range of the
result depends on the relationship between the operands according to Table
4-10.
4248 The fact that the argument of FPATAN is a ratio aids calculation of other
4249 trigonometric functions, including Arcsin and Arccos. These can be derived
4250 from Arctan via standard trigonometric identities. For example, the Arcsin
4251 function can be easily calculated using this identity:
   Arcsin x = Arctan (x / √(1 - x²)).

Thus, to find Arcsin (Y), push Y onto the NPX stack, then calculate
X = √(1 - Y²), pushing the result X onto the stack. Executing FPATAN then
leaves Arcsin (Y) at the top of the stack.
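The same sequence can be written as a short Python model, with math.atan2
standing in for FPATAN. The function names are chosen for the example and
the arithmetic is double precision, not extended real.

```python
import math

def fpatan(y, x):
    """Two-argument arctangent of y/x, the role FPATAN plays."""
    return math.atan2(y, x)

def arcsin(y):
    x = math.sqrt(1.0 - y * y)   # X = sqrt(1 - Y^2), valid for |y| <= 1
    return fpatan(y, x)
```

Because the argument is passed as a ratio, the end points y = 1 and y = -1
need no special handling even though x is then zero.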
4262 F2XM1 (2 to the X minus 1) calculates the function Y = 2^(X) - 1. X is taken
from the stack top and must be in the range -1 ≤ X ≤ 1. The result Y
4264 replaces the argument X at the stack top. If the argument is out of range,
4265 the results are undefined.
4267 This instruction is designed to produce a very accurate result even when X
4268 is close to 0. For values of the argument very close in magnitude to 1, a
larger error will be incurred. To obtain Y = 2^(X), add 1 to the result
delivered by F2XM1.

The following formulas show how values other than 2 may be raised to a
power of X:

   10^(X) = 2^(X * LOG{2}10)

   e^(X) = 2^(X * LOG{2}e)

   Y^(X) = 2^(X * LOG{2}Y)

4281 As shown in the next section, the 80387 has built-in instructions for
4282 loading the constants LOG{2}10 and LOG{2}e, and the FYL2X instruction may be
4283 used to calculate X*LOG{2}Y.
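A general power routine built from these pieces can be sketched in Python.
The helper names mirror the instructions but are models only (double
precision, positive base assumed). Splitting the exponent into an integer
part and a fraction, so that the fraction respects the F2XM1 operand range,
is one possible approach and not necessarily the one used by any particular
library.

```python
import math

def f2xm1(x):
    """2**x - 1 for -1 <= x <= 1, accurate near zero."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("operand must satisfy -1 <= x <= 1")
    return math.expm1(x * math.log(2.0))

def fyl2x(y, x):
    """y * log2(x) for x > 0."""
    return y * math.log2(x)

def power(base, exponent):
    """base**exponent computed as 2**(exponent * log2(base)), base > 0."""
    z = fyl2x(exponent, base)
    n = round(z)                 # integer part, applied by binary scaling
    frac = z - n                 # |frac| <= 0.5, inside the F2XM1 range
    return math.ldexp(1.0 + f2xm1(frac), n)
```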
Table 4-10. Results of FPATAN

   Sign(Y)   Sign(X)   |Y| < |X|?   Final Result
      +         +         Yes       0 < atan(Y/X) < π/4
      +         +         No        π/4 < atan(Y/X) < π/2
      +         -         No        π/2 < atan(Y/X) < 3 * π/4
      +         -         Yes       3 * π/4 < atan(Y/X) < π
      -         +         Yes       -π/4 < atan(Y/X) < 0
      -         +         No        -π/2 < atan(Y/X) < -π/4
      -         -         No        -3 * π/4 < atan(Y/X) < -π/2
      -         -         Yes       -π < atan(Y/X) < -3 * π/4
FYL2X (Y log base 2 of X) calculates the function Z = Y * LOG{2}X. X is
taken from the stack top and Y from ST(1). The operands must be in the
following ranges:

   0 < X < +∞
   -∞ < Y < +∞

The instruction pops the NPX stack and returns Z at the (new) stack top,
replacing the Y operand. If the operand is out of range (i.e., is negative)
the invalid-operation exception occurs.
4313 This function optimizes the calculations of log to any base other than two,
4314 because a multiplication is always required:
   LOG{N}x = (LOG{2}N)^(-1) * LOG{2}x
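The identity translates directly into a two-step computation: form the
reciprocal of LOG{2}N once, then supply it as the Y operand. The Python
sketch below models this in double precision; fyl2x and log_base are names
chosen for the example.

```python
import math

def fyl2x(y, x):
    """y * log2(x); a negative x models the invalid-operation case."""
    if x < 0.0:
        raise ValueError("invalid operation: X is negative")
    return y * math.log2(x)

def log_base(n, x):
    y = 1.0 / math.log2(n)   # (LOG{2}N)^(-1), computed once per base
    return fyl2x(y, x)       # the multiplication is folded into the log
```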
FYL2XP1 (Y log base 2 of (X + 1)) calculates the function
Z = Y * LOG{2}(X + 1). X is taken from the stack top and must be in the range
-(1 - SQRT(2)/2) < X < 1 - SQRT(2)/2. Y is taken from ST(1) and is unlimited
in range (-∞ < Y < +∞). FYL2XP1 pops the stack and returns Z at the (new)
stack top, replacing Y. If the argument is out of range, the results are
undefined.
4327 This instruction provides improved accuracy over FYL2X when computing the
logarithm of a number very close to 1, for example 1 + ε where ε << 1.
Providing ε rather than 1 + ε as the input to the function allows more
4330 significant digits to be retained.
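The loss of digits is easy to reproduce in double precision, where the
effect appears at larger values of ε than it would in extended real. In this
Python sketch, log1p stands in for the dedicated instruction; both function
names are models, not 80387 code.

```python
import math

def fyl2x(y, x):
    """y * log2(x): the caller has already formed x = 1 + eps."""
    return y * math.log2(x)

def fyl2xp1(y, x):
    """y * log2(1 + x): the caller passes eps itself."""
    return y * math.log1p(x) / math.log(2.0)
```

With ε = 1e-18, the sum 1.0 + ε rounds to exactly 1.0 in double precision,
so fyl2x(1.0, 1.0 + ε) returns zero, while fyl2xp1(1.0, ε) still returns a
small positive result.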
Table 4-11. Constant Instructions

   FLDZ      Load +0.0
   FLD1      Load +1.0
   FLDPI     Load π
   FLDL2T    Load log{2}10
   FLDL2E    Load log{2}e
   FLDLG2    Load log{10}2
   FLDLN2    Load log{e}2
4344 4.7 Constant Instructions
4346 Each of these instructions (Table 4-11) loads (pushes) a commonly used
4347 constant onto the stack. (ST(7) must be empty to avoid an invalid
4348 exception.) The values have full extended real precision (64 bits) and are
4349 accurate to approximately 19 decimal digits. Because an external real
4350 constant occupies 10 memory bytes, the constant instructions, which are
4351 only two bytes long, save storage and improve execution speed, in addition
4352 to simplifying programming.
4354 The constants used by these instructions are stored internally in a format
4355 more precise even than extended real. When loading the constant, the 80387
rounds the more precise internal constant according to the RC (rounding
control) bits of the control word. However, in spite of this rounding, the
4358 precision exception is not raised (to maintain compatibility). When the
4359 rounding control is set to round to nearest on the 80387, the 80387
4360 produces the same constant that is produced by the 80287.
4365 FLDZ (load zero) loads (pushes) +0.0 onto the NPX stack.
4370 FLD1 (load one) loads (pushes) +1.0 onto the NPX stack.
FLDPI (load π) loads (pushes) π onto the NPX stack.

FLDL2T (load log base 2 of 10) loads (pushes) the value LOG{2}10 onto the
NPX stack.

FLDL2E (load log base 2 of e) loads (pushes) the value LOG{2}e onto the NPX
stack.

FLDLG2 (load log base 10 of 2) loads (pushes) the value LOG{10}2 onto the
NPX stack.

FLDLN2 (load log base e of 2) loads (pushes) the value LOG{e}2 onto the NPX
stack.
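For reference, the seven constants can be tabulated in Python. The values
below are double-precision approximations (the 80387 holds them to extended
precision), and the dictionary and helper names are invented for this
sketch. The helper relates the 64-bit significand to the figure of
approximately 19 decimal digits quoted above.

```python
import math

CONSTANTS = {
    "FLDZ": 0.0,
    "FLD1": 1.0,
    "FLDPI": math.pi,
    "FLDL2T": math.log2(10.0),
    "FLDL2E": math.log2(math.e),
    "FLDLG2": math.log10(2.0),
    "FLDLN2": math.log(2.0),
}

def decimal_digits(significand_bits):
    """Decimal digits carried by a significand of the given width."""
    return significand_bits * math.log10(2.0)
```

The logarithmic constants come in reciprocal pairs: LOG{2}10 * LOG{10}2 = 1
and LOG{2}e * LOG{e}2 = 1.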
4402 4.8 Processor Control Instructions
4404 The processor control instructions are shown in Table 4-12. The instruction
4405 FSTSW is commonly used for conditional branching. The remaining instructions
4406 are not typically used in calculations; they provide control over the 80387
4407 NPX for system-level activities. These activities include initialization,
4408 exception handling, and task switching.
4410 As shown in Table 4-12, many of the NPX processor control instructions have
4411 two forms of assembler mnemonic:
4413 1. A wait form, where the mnemonic is prefixed only with an F, such as
4414 FSTSW. This form checks for unmasked numeric exceptions.
4416 2. A no-wait form, where the mnemonic is prefixed with an FN, such as
4417 FNSTSW. This form ignores unmasked numeric exceptions.
4419 When the control instruction is coded using the no-wait form of the
4420 mnemonic, the ASM386 assembler does not precede the ESC instruction with a
4421 wait instruction, and the CPU does not test the ERROR# status line from the
4422 NPX before executing the processor control instruction.
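At the byte level, the wait form is simply the no-wait encoding preceded by
the CPU wait opcode. The Python sketch below illustrates this for three
instructions. The opcode bytes (9BH for WAIT, and the two-byte ESC
encodings) are the standard x87 encodings and are not taken from this
section; the table and function names are invented for the example.

```python
WAIT = bytes([0x9B])   # opcode of the 80386 WAIT instruction

NO_WAIT_ENCODINGS = {
    "FNINIT": bytes([0xDB, 0xE3]),
    "FNCLEX": bytes([0xDB, 0xE2]),
    "FNSTSW AX": bytes([0xDF, 0xE0]),
}

def wait_form(no_wait_mnemonic):
    """Return (mnemonic, encoding) for the wait form of a no-wait instruction."""
    mnemonic = "F" + no_wait_mnemonic[2:]          # FNINIT -> FINIT
    return mnemonic, WAIT + NO_WAIT_ENCODINGS[no_wait_mnemonic]
```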
4424 Only the processor control class of instructions have this alternate
4425 no-wait form. All numeric instructions are automatically synchronized by the
4426 80386; the CPU transfers all operands before initiating the next
4427 instruction. Because of this automatic synchronization by the 80386, numeric
4428 instructions for the 80387 need not be preceded by a CPU wait instruction
4429 in order to execute correctly.
4431 It should also be noted that the 8087 instructions FENI and FDISI and the
4432 80287 instruction FSETPF perform no function in the 80387. If these opcodes
4433 are detected in an 80386/80387 instruction stream, the 80387 performs no
4434 specific operation and no internal states are affected. For programmers
4435 interested in porting numeric software from 80287 or 8087 environments to
4436 the 80386, however, it should be noted that program sections containing
4437 these exception-handling instructions are not likely to be completely
portable to the 80387. Appendix C and Appendix D contain a more complete
4439 description of the differences between the 80387 and the 80287/8087.
4442 Table 4-12. Processor Control Instructions
4444 FINIT/FNINIT Initialize processor
4445 FLDCW Load control word
4446 FSTCW/FNSTCW Store control word
4447 FSTSW/FNSTSW Store status word
4448 FSTSW AX/FNSTSW AX Store status word to AX
4449 FCLEX/FNCLEX Clear exceptions
4450 FSTENV/FNSTENV Store environment
4451 FLDENV Load environment
4452 FSAVE/FNSAVE Save state
4453 FRSTOR Restore state
4454 FINCSTP Increment stack pointer
4455 FDECSTP Decrement stack pointer
4463 FINIT/FNINIT (initialize processor) sets the 80387 NPX into a known state,
4464 unaffected by any previous activity. It sets the control word to its default
4465 value 037FH (round to nearest, all exceptions masked, 64 bits of precision),
4466 clears the status word, and empties all floating-point stack registers. The
4467 no-wait form of this instruction causes the 80387 to abort any previous
4468 numeric operations currently executing in the NEU.
4470 This instruction performs the functional equivalent of a hardware RESET,
4471 with one exception: RESET causes the IM bit of the control word to be reset
4472 and the ES and IE bits of the status word to be set as a means of signaling
4473 the presence of an 80387; FINIT puts the opposite values in these bits.
FINIT checks for unmasked numeric exceptions; FNINIT does not. Note that if
4476 FNINIT is executed while a previous 80387 memory-referencing instruction is
4477 running, 80387 bus cycles in progress are aborted. This instruction may be
4478 necessary to clear the 80387 if a processor-extension segment-overrun
4479 exception (interrupt 9) is detected by the CPU.
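The default control word 037FH can be decoded field by field. The Python
sketch below assumes the control word layout given elsewhere in this manual
(exception masks in bits 0 through 5, precision control in bits 8 and 9,
rounding control in bits 10 and 11); the function name and the dictionary
keys are invented for the example.

```python
def decode_control_word(cw):
    """Split a control word into rounding, precision, and mask summary."""
    rounding = ["nearest", "down", "up", "chop"][(cw >> 10) & 0b11]
    # Precision control 01B is reserved and decodes to None here.
    precision = {0b00: 24, 0b10: 53, 0b11: 64}.get((cw >> 8) & 0b11)
    return {
        "rounding": rounding,
        "precision_bits": precision,
        "all_masked": (cw & 0x3F) == 0x3F,
    }
```

Decoding 037FH gives round to nearest, 64 bits of precision, and all
exceptions masked, in agreement with the description of FINIT.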
4484 FLDCW (load control word) replaces the current processor control word with
4485 the word defined by the source operand. This instruction is typically used
4486 to establish or change the 80387's mode of operation. Note that if an
4487 exception bit in the status word is set, loading a new control word that
4488 unmasks that exception will activate the ERROR# output of the 80387. When
4489 changing modes, the recommended procedure is to first clear any exceptions
4490 and then load the new control word.
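The interaction between pending exception flags and a newly loaded control
word can be modeled in Python. The sketch assumes the usual bit assignments
(exception flags and their masks in bits 0 through 5 of the status and
control words); the function names are invented, and fnclex clears the same
bits that the text attributes to FCLEX/FNCLEX, plus the stack fault bit.

```python
EXCEPTION_BITS = 0x003F   # IE, DE, ZE, OE, UE, PE in bits 0..5

def error_asserted(status_word, control_word):
    """True when a flagged exception is unmasked (ERROR# would be active)."""
    pending = status_word & EXCEPTION_BITS
    masked = control_word & EXCEPTION_BITS
    return (pending & ~masked) != 0

def fnclex(status_word):
    """Clear exception flags, stack fault, error summary, and busy bits."""
    return status_word & ~0x80FF & 0xFFFF

def change_mode(status_word, new_control_word):
    """Recommended order: clear exceptions first, then load the control word."""
    return fnclex(status_word), new_control_word
```

With the zero-divide flag set (status 0004H), loading 037BH directly would
activate ERROR#; going through change_mode does not.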
4493 4.8.3 FSTCW/FNSTCW destination
4495 FSTCW/FNSTCW (store control word) writes the processor control word to the
4496 memory location defined by the destination. FSTCW checks for unmasked
4497 numeric exceptions; FNSTCW does not.
4500 4.8.4 FSTSW/FNSTSW destination
FSTSW/FNSTSW (store status word) writes the current value of the 80387
status word to the destination operand in memory. The instruction is used
to:

   • Implement conditional branching following a comparison, FPREM, or
     FPREM1 instruction (FSTSW).

   • Invoke exception handlers (by polling the exception bits) in
     environments that do not use interrupts (FSTSW).

FSTSW checks for unmasked numeric exceptions; FNSTSW does not.
4514 4.8.5 FSTSW AX/FNSTSW AX
4516 FSTSW AX/FNSTSW AX (store status word to AX) is a special 80387 instruction
4517 that writes the current value of the 80387 status word directly into the
4518 80386 AX register. This instruction optimizes conditional branching in
4519 numeric programs, where the 80386 CPU must test the condition of various NPX
status bits. The waited form FSTSW AX checks for unmasked numeric
exceptions; the non-waited form FNSTSW AX does not.
4523 When this instruction is executed, the 80386 AX register is updated with
4524 the NPX status word before the CPU executes any further instructions. The
4525 status stored is that from the completion of the prior ESC instruction.
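The path from the status word in AX to a conditional branch can be traced
with a Python model. It assumes the standard bit positions (C0 in bit 8, C2
in bit 10, and C3 in bit 14 of the status word) and the behavior of SAHF,
which copies AH into the low byte of the 80386 flags; the function names are
invented for this sketch.

```python
def sahf_flags(ax):
    """Flags produced by SAHF when AX holds the 80387 status word."""
    ah = (ax >> 8) & 0xFF
    return {"CF": ah & 1, "PF": (ah >> 2) & 1, "ZF": (ah >> 6) & 1}

def branch_taken(ax):
    """Name the conditional jump that would be taken after a comparison."""
    flags = sahf_flags(ax)
    if flags["PF"]:
        return "JP"    # unordered: test this case first
    if flags["ZF"]:
        return "JE"
    if flags["CF"]:
        return "JB"
    return "JA"
```

Testing PF first matters because the unordered result also sets ZF and CF.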
4530 FCLEX/FNCLEX (clear exceptions) clears all exception flags, the exception
4531 status flag and the busy flag in the status word. As a consequence, the
80387's ERROR# line goes inactive. FCLEX checks for unmasked numeric
exceptions; FNCLEX does not.
4536 4.8.7 FSAVE/FNSAVE destination
FSAVE/FNSAVE (save state) writes the full 80387 state (environment plus
register stack) to the memory location defined by the destination operand.
Figure 4-1 and Figure 4-2 show the layout of the save area; the size and
layout of the save area depend on the operating mode of the 80386
(real-address mode or protected mode) and on the operand-size attribute in
effect for the instruction (32-bit operand or 16-bit operand). When the
80386 is in virtual-8086 mode, the real-address mode formats are used.
Typically the instruction is coded to save this image on the CPU stack.
4547 The values in the tag word in memory are determined during the execution of
4548 FSAVE/FNSAVE. If the tag in the status register indicates that the
4549 corresponding register is nonempty, the 80387 examines the data in the
4550 register and stores the appropriate tag in memory. Thus the tag that is
4551 stored always reflects the actual content of the register.
4553 FNSAVE delays its execution until all NPX activity completes normally.
4554 Thus, the save image reflects the state of the NPX following the completion
4555 of any running instruction. After writing the state image to memory,
4556 FSAVE/FNSAVE initializes the 80387 as if FINIT/FNINIT had been executed.
4558 FSAVE/FNSAVE is useful whenever a program wants to save the current state
of the NPX and initialize it for a new routine. Three examples are:
4561 1. An operating system needs to perform a context switch (suspend the
4562 task that had been running and give control to a new task).
4564 2. An exception handler needs to use the 80387.
4566 3. An application task wants to pass a "clean" 80387 to a subroutine.
FSAVE checks for unmasked numeric exceptions before executing; FNSAVE does
not.
Figure 4-1. FSAVE/FRSTOR Memory Layout (32-Bit)

         +-----------+-----------+-----------+-----------+
         |           |           |           |           | +0H
         |           |           |           |           | +4H
         |           |           |           |           | +8H
         |           |      ENVIRONMENT      |           | +CH
         |           |           |           |           | +10H
         |           |           |           |           | +14H
         |           |           |           |           | +18H
         +-----------+-----------+-----------+-----------+

         +------+----------+-----------------------------+
   ST(0) | SIGN | EXPONENT |         SIGNIFICAND         | +1CH
   ST(1) |      |          |                             | +26H
   ST(2) |      |          |                             | +30H
   ST(3) |      |          |                             | +3AH
   ST(4) |      |          |                             | +44H
   ST(5) |      |          |                             | +4EH
   ST(6) |      |          |                             | +58H
   ST(7) |      |          |                             | +62H
         +------+----------+-----------------------------+
Figure 4-2. FSAVE/FRSTOR Memory Layout (16-Bit)

         +----------+----------+
         |          |          | +0H
         |          |          | +2H
         |          |          | +4H
         |     ENVIRONMENT     | +6H
         |          |          | +8H
         |          |          | +AH
         |          |          | +CH
         +----------+----------+

         +------+----------+-----------------------------+
   ST(0) | SIGN | EXPONENT |         SIGNIFICAND         | +EH
   ST(1) |      |          |                             | +18H
   ST(2) |      |          |                             | +22H
   ST(3) |      |          |                             | +2CH
   ST(4) |      |          |                             | +36H
   ST(5) |      |          |                             | +40H
   ST(6) |      |          |                             | +4AH
   ST(7) |      |          |                             | +54H
         +------+----------+-----------------------------+
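The offsets in Figures 4-1 and 4-2 follow from two numbers: the size of the
environment (28 bytes in the 32-bit format, 14 bytes in the 16-bit format)
and the 10 bytes occupied by each stack element. The Python sketch below
recomputes them; the function name is invented for the example.

```python
REGISTER_BYTES = 10   # one extended-real stack element

def fsave_layout(operand_size):
    """Return (offsets of ST(0)..ST(7), total image size) in bytes."""
    env_bytes = {32: 28, 16: 14}[operand_size]
    offsets = [env_bytes + REGISTER_BYTES * i for i in range(8)]
    return offsets, env_bytes + REGISTER_BYTES * 8
```

A save area must therefore provide 108 bytes for the 32-bit format and 94
bytes for the 16-bit format.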
4624 FRSTOR (restore state) reloads the 80387 state from the memory area defined
4625 by the source operand. This information should have been written by a
4626 previous FSAVE/FNSAVE instruction and not altered by any other instruction.
FRSTOR automatically waits, checking for interrupts, until all data transfers
4628 are completed before continuing to the next instruction.
4630 Note that the 80387 "reacts" to its new state at the conclusion of the
4631 FRSTOR. It generates an exception request, for example, if the exception and
4632 mask bits in the memory image so indicate when the next WAIT or
4633 exception-checking ESC instruction is executed.
4636 4.8.9 FSTENV/FNSTENV destination
FSTENV/FNSTENV (store environment) writes the 80387's basic status (control,
status, and tag words, and exception pointers) to the memory location
defined by the destination operand. Typically, the environment is saved on
the CPU stack. FSTENV/FNSTENV is often used by exception handlers because it
provides access to the exception pointers that identify the offending
instruction and operand. After saving the environment, FSTENV/FNSTENV sets
all exception masks in the 80387 control word (i.e., masks all exceptions).
FSTENV checks for pending exceptions before executing; FNSTENV does not.
4648 Figures 4-3 through 4-6 show the format of the environment data in memory;
the size and layout of the save area depend on the operating mode of the
4650 80386 (real-address mode or protected mode) and on the operand-size
4651 attribute in effect for the instruction (32-bit operand or 16-bit operand).
4652 When the 80386 is in virtual-8086 mode, the real-address mode formats are
4653 used. FNSTENV does not store the environment until all NPX activity has
4654 completed. Thus, the data saved by the instruction reflects the 80387 after
4655 any previously decoded instruction has been executed.
4657 The values in the tag word in memory are determined during the execution of
4658 FNSTENV/FSTENV. If the tag in the status register indicates that the
4659 corresponding register is nonempty, the 80387 examines the data in the
4660 register and stores the appropriate tag in memory. Thus the tag that is
4661 stored always reflects the actual content of the register.
Figure 4-3. Protected Mode 80387 Environment, 32-Bit Format

                    32-BIT PROTECTED MODE FORMAT

   +------------------------------+------------------------------+
   |           RESERVED           |         CONTROL WORD         | 0H
   +------------------------------+------------------------------+
   |           RESERVED           |         STATUS WORD          | 4H
   +------------------------------+------------------------------+
   |           RESERVED           |           TAG WORD           | 8H
   +------------------------------+------------------------------+
   |                          IP OFFSET                          | CH
   +-----------+------------------+------------------------------+
   | 0 0 0 0 0 |   OPCODE 10..0   |         CS SELECTOR          | 10H
   +-----------+------------------+------------------------------+
   |                     DATA OPERAND OFFSET                     | 14H
   +------------------------------+------------------------------+
   |           RESERVED           |       OPERAND SELECTOR       | 18H
   +------------------------------+------------------------------+
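An exception handler that has stored the environment can pick the fields
apart as shown in the following Python sketch. It assumes the 32-bit
protected-mode format of Figure 4-3, little-endian storage, and that the
doubleword at offset CH holds the instruction pointer offset; the function
name and dictionary keys are invented for the example.

```python
import struct

def parse_env32_protected(image):
    """Decode a 28-byte, 32-bit protected-mode environment image."""
    cw, sw, tw, ip_off, cs_op, data_off, op_sel = struct.unpack("<7I", image[:28])
    return {
        "control_word": cw & 0xFFFF,
        "status_word": sw & 0xFFFF,
        "tag_word": tw & 0xFFFF,
        "ip_offset": ip_off,                 # assumed content of offset CH
        "cs_selector": cs_op & 0xFFFF,
        "opcode": (cs_op >> 16) & 0x7FF,     # opcode bits 10..0
        "operand_offset": data_off,
        "operand_selector": op_sel & 0xFFFF,
    }
```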
Figure 4-4. Real Mode 80387 Environment, 32-Bit Format

                   32-BIT REAL-ADDRESS MODE FORMAT

   +------------------------------+------------------------------+
   |           RESERVED           |         CONTROL WORD         | 0H
   +------------------------------+------------------------------+
   |           RESERVED           |         STATUS WORD          | 4H
   +------------------------------+------------------------------+
   |           RESERVED           |           TAG WORD           | 8H
   +------------------------------+------------------------------+
   |           RESERVED           |  INSTRUCTION POINTER 15..0   | CH
   +---------+----------------------------+---+------------------+
   | 0 0 0 0 | INSTRUCTION POINTER 31..16 | 0 |   OPCODE 10..0   | 10H
   +---------+----------------------------+---+------------------+
   |           RESERVED           |    OPERAND POINTER 15..0     | 14H
   +---------+-------------------------+-------------------------+
   | 0 0 0 0 | OPERAND POINTER 31..16  | 0 0 0 0 0 0 0 0 0 0 0 0 | 18H
   +---------+-------------------------+-------------------------+
Figure 4-5. Protected Mode 80387 Environment, 16-Bit Format

       16-BIT PROTECTED MODE FORMAT

   +---------------------------------+
   |          CONTROL WORD           | 0H
   +---------------------------------+
   |           STATUS WORD           | 2H
   +---------------------------------+
   |            TAG WORD             | 4H
   +---------------------------------+
   |            IP OFFSET            | 6H
   +---------------------------------+
   |           CS SELECTOR           | 8H
   +---------------------------------+
   |         OPERAND OFFSET          | AH
   +---------------------------------+
   |        OPERAND SELECTOR         | CH
   +---------------------------------+
Figure 4-6. Real Mode 80387 Environment, 16-Bit Format

         16-BIT REAL-ADDRESS MODE
       AND VIRTUAL-8086 MODE FORMAT

   +---------------------------------+
   |          CONTROL WORD           | 0H
   +---------------------------------+
   |           STATUS WORD           | 2H
   +---------------------------------+
   |            TAG WORD             | 4H
   +---------------------------------+
   |    INSTRUCTION POINTER 15..0    | 6H
   +---------+-+---------------------+
   |IP 19..16|0|    OPCODE 10..0     | 8H
   +---------+-+---------------------+
   |      OPERAND POINTER 15..0      | AH
   +---------+-+---------------------+
   |OP 19..16|0|0 0 0 0 0 0 0 0 0 0 0| CH
   +---------+-+---------------------+
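In the 16-bit real-address format the 20-bit pointers are split across two
words, with the upper four bits stored in the top nibble of the second
word. The Python sketch below reassembles them. It assumes little-endian
storage and that the words at offsets 0H, 2H, and 4H hold the control,
status, and tag words; the function name and keys are invented for the
example.

```python
import struct

def parse_env16_real(image):
    """Decode a 14-byte, 16-bit real-address-mode environment image."""
    cw, sw, tw, ip_lo, ip_hi_op, op_lo, op_hi = struct.unpack("<7H", image[:14])
    return {
        "control_word": cw,
        "status_word": sw,
        "tag_word": tw,
        "instruction_pointer": ((ip_hi_op >> 12) << 16) | ip_lo,   # 20 bits
        "opcode": ip_hi_op & 0x7FF,                                # bits 10..0
        "operand_pointer": ((op_hi >> 12) << 16) | op_lo,          # 20 bits
    }
```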
4753 4.8.10 FLDENV source
4755 FLDENV (load environment) reloads the environment from the memory area
4756 defined by the source operand. This data should have been written by a
4757 previous FSTENV/FNSTENV instruction. CPU instructions (that do not reference
4758 the environment image) may immediately follow FLDENV. FLDENV automatically
waits for all data transfers to complete before executing the next
instruction.
4762 Note that loading an environment image that contains an unmasked exception
4763 causes a numeric exception when the next WAIT or exception-checking ESC
4764 instruction is executed.
FINCSTP (increment NPX stack pointer) adds 1 to the stack top pointer (TOP)
in the status word. It does not alter tags or register contents, nor does it
transfer data. It is not equivalent to popping the stack, because it does
not set the tag of the previous stack top to empty. Incrementing the stack
pointer when TOP=7 produces TOP=0.

FDECSTP (decrement NPX stack pointer) subtracts 1 from TOP, the stack top
pointer in the status word. No tags or registers are altered, nor is any
data transferred. Executing FDECSTP when TOP=0 produces TOP=7.
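The wraparound of the stack top pointer is ordinary modulo-8 arithmetic on a
three-bit field. The Python sketch below models both instructions on a
status word value, assuming the field occupies bits 11 through 13 of the
status word; the function names are models, not 80387 code.

```python
TOP_SHIFT = 11
TOP_MASK = 0b111 << TOP_SHIFT

def get_top(status_word):
    return (status_word & TOP_MASK) >> TOP_SHIFT

def _with_top(status_word, top):
    # Replace only the three TOP bits; every other bit is preserved.
    return (status_word & ~TOP_MASK & 0xFFFF) | ((top & 0b111) << TOP_SHIFT)

def fincstp(status_word):
    return _with_top(status_word, get_top(status_word) + 1)

def fdecstp(status_word):
    return _with_top(status_word, get_top(status_word) - 1)
```

Neither function touches any state other than the TOP field, which mirrors
the statement that tags and register contents are not altered.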
4783 4.8.13 FFREE destination
4785 FFREE (free register) changes the destination register's tag to empty; the
4786 content of the register is unaffected.
4791 FNOP (no operation) effectively performs no operation.
4794 4.8.15 FWAIT (CPU Instruction)
4796 FWAIT is not actually an 80387 instruction, but an alternate mnemonic for
4797 the 80386 WAIT instruction. The FWAIT or WAIT mnemonic should be coded
4798 whenever the programmer wants to check for a pending error before modifying
4799 a variable used in the previous floating-point instruction. Coding an FWAIT
4800 instruction after an 80387 instruction ensures that unmasked numeric
4801 exceptions occur and exception handlers are invoked before the next
4802 instruction has a chance to examine the results of the 80387 instruction.
4804 More information on when to code an FWAIT instruction is given in Chapter 5
4805 in the section "Concurrent Processing with the 80387."
4809 Chapter 5 Programming Numeric Applications
4813 5.1 Programming Facilities
4815 As described previously, the 80387 NPX is programmed simply as an extension
4816 of the 80386 CPU. This section describes how programmers in ASM386 and in a
4817 variety of higher-level languages can work with the 80387.
4819 The level of detail in this section is intended to give programmers a basic
4820 understanding of the software tools that can be used with the 80387, but
4821 this information does not document the full capabilities of these
4822 facilities. Complete documentation is available with each program
4823 development product.
4826 5.1.1 High-Level Languages
4828 For programmers using high-level languages, the programming and operation
of the NPX are handled automatically by the compiler. A variety of Intel
4830 high-level languages are available that automatically make use of the 80387
NPX when appropriate. These languages include C-386 and PL/M-386. In
addition, many high-level language compilers are available from independent
software vendors.
4835 Each of these high-level languages has special numeric libraries allowing
4836 programs to take advantage of the capabilities of the 80387 NPX. No special
4837 programming conventions are necessary to make use of the 80387 NPX when
4838 programming numeric applications in any of these languages.
Programmers in PL/M-386 and ASM386 can also use many of these library
routines through the 80387 Support Library. This library implements many of
the functions provided by higher-level languages, including exception
handlers, ASCII-to-floating-point conversions, and a more complete set of
transcendental functions than that provided by the 80387 instruction set.
5.1.2 C Programs
C programmers automatically cause the C compiler to generate 80387
4851 instructions when they use the double and float data types. The float type
4852 corresponds to the 80387's single real format; the double type corresponds
to the 80387's double real format. The statement #include <math.h> declares
mathematical functions such as sin and sqrt, which return values of type
double. Figure 5-1 illustrates the ease with which C programs interface
with the 80387.
4859 Figure 5-1. Sample C-386 Program
4861 XENIX286 C386 COMPILER, V0.2 COMPILATION OF MODULE SAMPLE
4862 OBJECT MODULE PLACED IN sample.obj
4863 COMPILER INVOKED BY: c386 sample.c
/******************************************************
 *                 SAMPLE C PROGRAM                   *
 ******************************************************/

/** Include stdio.h for printf **/
/** Include math declarations for transcendentals and others **/

#include <stdio.h>
#include <math.h>

#define PI 3.141592654

int main(void)
{
    double sin_result, cos_result;
    double angle_deg = 0.0, angle_rad;
    int i, no_of_trial = 4;

    /* Print sine and cosine for 0, 30, 60, and 90 degrees */
    for (i = 1; i <= no_of_trial; i++) {
        angle_rad = angle_deg * PI / 180.0;
        sin_result = sin(angle_rad);
        cos_result = cos(angle_rad);
        printf("sine of %f degrees equals %f\n", angle_deg, sin_result);
        printf("cosine of %f degrees equals %f\n\n", angle_deg, cos_result);
        angle_deg = angle_deg + 30.0;
    }
    return 0;
}
4896 C386 COMPILATION COMPLETE. 0 WARNINGS, 0 ERRORS
5.1.3 PL/M-386
Programmers in PL/M-386 can access a very useful subset of the 80387's
4902 numeric capabilities. The PL/M-386 REAL data type corresponds to the NPX's
single real (32-bit) format. This data type provides a range of about
8.43 * 10^(-37) <= |X| <= 3.38 * 10^(38), with about seven significant decimal
4905 digits. This representation is adequate for the data manipulated by many
4906 microcomputer applications.
4908 The utility of the REAL data type is extended by the PL/M-386 compiler's
4909 practice of holding intermediate results in the 80387's extended real
4910 format. This means that the full range and precision of the processor are
4911 utilized for intermediate results. Underflow, overflow, and rounding
4912 exceptions are most likely to occur during intermediate computations rather
4913 than during calculation of an expression's final result. Holding
4914 intermediate results in extended-precision real format greatly reduces the
4915 likelihood of overflow and underflow and eliminates roundoff as a serious
4916 source of error until the final assignment of the result is performed.
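The benefit of wide intermediates can be illustrated with a small C sketch.
It is not PL/M-386 and says nothing about how that compiler generates code;
it only contrasts an intermediate product rounded to single precision with
the same product held in a wider type. The function names are hypothetical,
and the sketch assumes only that long double has a larger exponent range
than float (whether it maps to the 80-bit extended format depends on the
compiler).

```c
#include <assert.h>
#include <float.h>

/* a*b/c with the product forced into single precision before the divide. */
float scaled_narrow(float a, float b, float c)
{
    volatile float product = a * b;   /* volatile forces a 32-bit store */

    assert(c != 0.0f);
    return product / c;
}

/* Same expression with the intermediate held in a wider format. */
float scaled_wide(float a, float b, float c)
{
    long double product = (long double)a * (long double)b;

    assert(c != 0.0f);
    return (float)(product / (long double)c);
}
```

With a = b = c = 1e30, the narrow version overflows in the intermediate
product even though the final quotient is representable; the wide version
returns a value close to 1e30.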
4918 The compiler generates 80387 code to evaluate expressions that contain REAL
4919 data types, whether variables or constants or both. This means that
4920 addition, subtraction, multiplication, division, comparison, and assignment
4921 of REALs will be performed by the NPX. INTEGER expressions, on the other
4922 hand, are evaluated on the CPU.
4924 Five built-in procedures (Table 5-1) give the PL/M-386 programmer access to
4925 80387 functions manipulated by the processor control instructions. Prior to
4926 any arithmetic operations, a typical PL/M-386 program will set up the NPX
4927 using the INIT$REAL$MATH$UNIT procedure and then issue SET$REAL$MODE to
4928 configure the NPX. SET$REAL$MODE loads the 80387 control word, and its
4929 16-bit parameter has the format shown for the control word in Chapter 2.
4930 The recommended value of this parameter is 033EH (round to nearest, 64-bit
4931 precision, all exceptions masked except invalid operation). Other settings
4932 may be used at the programmer's discretion.
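To see why 033EH corresponds to the settings listed above, the control word
can be split into its fields. The following C sketch does this; the struct
and function names are hypothetical, and the bit positions are those of the
control word format (exception masks in bits 0-5, precision control in bits
8-9, rounding control in bits 10-11).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct control_fields {
    unsigned masks;      /* bits 0-5: IM, DM, ZM, OM, UM, PM (1 = masked)       */
    unsigned precision;  /* bits 8-9: 0 = 24-bit, 2 = 53-bit, 3 = 64-bit        */
    unsigned rounding;   /* bits 10-11: 0 = nearest, 1 = down, 2 = up, 3 = chop */
};

void decode_control_word(uint16_t cw, struct control_fields *out)
{
    assert(out != NULL);
    out->masks     = cw & 0x3Fu;
    out->precision = (cw >> 8) & 3u;
    out->rounding  = (cw >> 10) & 3u;
}
```

For 033EH the mask field is 3EH: every mask bit is set except bit 0, the
invalid-operation mask, so only that exception is unmasked.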
4934 If any exceptions are unmasked, an exception handler must be provided in
4935 the form of an interrupt procedure that is designated to be invoked via CPU
4936 interrupt vector number 16. The exception handler can use the GET$REAL$ERROR
4937 procedure to obtain the low-order byte of the 80387 status word and to then
4938 clear the exception flags. The byte returned by GET$REAL$ERROR contains the
exception flags; these can be examined to determine the source of the
exception.
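The "store, then clear" behavior of GET$REAL$ERROR can be modeled in C as
follows. This is a sketch operating on a copy of the status word, not the
PL/M-386 built-in itself; the names are hypothetical, and the model assumes
the clear step behaves like FNCLEX, which clears the low byte and the busy
bit (bit 15) while leaving the condition codes and TOP unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    FLAG_INVALID    = 0x01,
    FLAG_DENORMAL   = 0x02,
    FLAG_ZERODIVIDE = 0x04,
    FLAG_OVERFLOW   = 0x08,
    FLAG_UNDERFLOW  = 0x10,
    FLAG_PRECISION  = 0x20
};

/* Return the low-order byte of the status word, then clear the
   exception flags it contains. */
uint8_t get_real_error(uint16_t *status_word)
{
    uint8_t flags;

    assert(status_word != NULL);
    flags = (uint8_t)(*status_word & 0x00FFu);
    *status_word &= 0x7F00u;
    return flags;
}
```

An exception handler would test the returned byte against the flag values,
for example (flags & FLAG_INVALID), to decide which recovery path to take.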
4942 The SAVE$REAL$STATUS and RESTORE$REAL$STATUS procedures are provided
4943 for multitasking environments where a running task that uses the 80387 may
4944 be preempted by another task that also uses the 80387. It is the
4945 responsibility of the operating system to issue SAVE$REAL$STATUS before it
4946 executes any statements that affect the 80387; these include the
4947 INIT$REAL$MATH$UNIT and SET$REAL$MODE procedures as well as arithmetic
4948 expressions. SAVE$REAL$STATUS saves the 80387 state (registers, status, and
4949 control words, etc.) on the CPU's stack. RESTORE$REAL$STATUS reloads the
4950 state information; the preempting task must invoke this procedure before
4951 terminating in order to restore the 80387 to its state at the time the
4952 running task was preempted. This enables the preempted task to resume
4953 execution from the point of its preemption.
Table 5-1. PL/M-386 Built-In Procedures

Procedure             80387 Instruction   Description
INIT$REAL$MATH$UNIT   FINIT               Initialize processor.
SET$REAL$MODE         FLDCW               Set exception masks, rounding,
                                          precision, and infinity controls.
GET$REAL$ERROR        FNSTSW              Store, then clear, exception flags.
SAVE$REAL$STATUS      FNSAVE              Save processor state.
RESTORE$REAL$STATUS   FRSTOR              Restore processor state.
5.1.4 ASM386
The ASM386 assembly language provides programmers with complete access to
4973 all of the facilities of the 80386 and 80387 processors.
The programmer's view of the 80386/80387 hardware is a single machine with:
■ 8 general registers
■ 6 segment registers
■ 8 floating-point registers, organized as a stack
4985 5.1.4.1 Defining Data
4987 The ASM386 directives shown in Table 5-2 allocate storage for 80387
4988 variables and constants. As with other storage allocation directives, the
4989 assembler associates a type with any variable defined with these directives.
4990 The type value is equal to the length of the storage unit in bytes (10 for
4991 DT, 8 for DQ, etc.). The assembler checks the type of any variable coded in
4992 an instruction to be certain that it is compatible with the instruction.
4993 For example, the coding FIADD ALPHA will be flagged as an error if ALPHA's
4994 type is not 2 or 4, because integer addition is only available for word and
4995 short integer (doubleword) data types. The operand's type also tells the
4996 assembler which machine instruction to produce; although to the programmer
4997 there is only an FIADD instruction, a different machine instruction is
4998 required for each operand type.
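The way the operand type selects a machine instruction can be sketched in C.
The two opcode bytes are the encodings of FIADD with a word integer (DE /0)
and a short integer (DA /0) memory operand; the function names are
hypothetical and the sketch models only the selection step, not ASM386
itself.

```c
#include <assert.h>

/* Return the first opcode byte for FIADD with a memory operand of the
   given type (size in bytes), or -1 if the type is not legal for FIADD. */
int fiadd_opcode(unsigned operand_type)
{
    switch (operand_type) {
    case 2:  return 0xDE;   /* word integer (DW)       */
    case 4:  return 0xDA;   /* short integer (DD)      */
    default: return -1;     /* e.g. 8 (DQ) or 10 (DT)  */
    }
}

/* Model of the assembler's type check: the coding is rejected outright
   when the operand type is not 2 or 4. */
int assemble_fiadd(unsigned operand_type)
{
    int opcode = fiadd_opcode(operand_type);

    assert(opcode != -1);
    return opcode;
}
```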
5000 On occasion it is desirable to use an instruction with an operand that has
5001 no declared type. For example, if register BX points to a short integer
5002 variable, a programmer may want to code FIADD [BX]. This can be done by
5003 informing the assembler of the operand's type in the instruction, coding
5004 FIADD DWORD PTR [BX]. The corresponding overrides for the other storage
5005 allocations are WORD PTR, QWORD PTR, and TBYTE PTR.
5007 The assembler does not, however, check the types of operands used in
5008 processor control instructions. Coding FRSTOR [BP] implies that the
5009 programmer has set up register BP to point to the location (probably in the
5010 stack) where the processor's 94-byte state record has been previously saved.
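The size of the state record follows from its layout: a seven-field
environment followed by the eight 10-byte registers. The C sketch below
computes it for both operand sizes; the function name is hypothetical. The
94-byte figure corresponds to the 16-bit format, and the 32-bit format is
larger because each environment field occupies a doubleword.

```c
#include <assert.h>

/* Bytes written by FSAVE/FNSAVE. operand_size_bits is 16 or 32 and
   selects the width of each of the seven environment fields. */
unsigned state_record_bytes(unsigned operand_size_bits)
{
    assert(operand_size_bits == 16u || operand_size_bits == 32u);
    return 7u * (operand_size_bits / 8u) + 8u * 10u;
}
```

Because the assembler does not check this operand, reserving too few bytes
for the record is an error that shows up only at run time.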
5012 The initial values for 80387 constants may be coded in several different
5013 ways. Binary integer constants may be specified as bit strings, decimal
5014 integers, octal integers, or hexadecimal strings. Packed decimal values are
5015 normally written as decimal integers, although the assembler will accept and
5016 convert other representations of integers. Real values may be written as
5017 ordinary decimal real numbers (decimal point required), as decimal numbers
5018 in scientific notation, or as hexadecimal strings. Using hexadecimal strings
5019 is primarily intended for defining special values such as infinities, NaNs,
5020 and denormalized numbers. Most programmers will find that ordinary decimal
5021 and scientific decimal provide the simplest way to initialize 80387
5022 constants. Figure 5-2 compares several ways of setting the various 80387
5023 data types to the same initial value.
5025 Note that preceding 80387 variables and constants with the ASM386 EVEN
5026 directive ensures that the operands will be word-aligned in memory. The best
5027 performance is obtained when data transfers are double-word aligned. All
5028 80387 data types occupy integral numbers of words so that no storage is
5029 "wasted" if blocks of variables are defined together and preceded by a
5030 single EVEN declarative.
Table 5-2. ASM386 Storage Allocation Directives

Directive   Interpretation      Data Types
DW          Define Word         Word integer
DD          Define Doubleword   Short integer, short real
DQ          Define Quadword     Long integer, long real
DT          Define Tenbyte      Packed decimal, temporary real
Figure 5-2. Sample 80387 Constants

; THE FOLLOWING ALL ALLOCATE THE CONSTANT: -126
; NOTE TWO'S COMPLEMENT STORAGE OF NEGATIVE BINARY INTEGERS.
;
                EVEN                            ; FORCE WORD ALIGNMENT
WORD_INTEGER    DW  1111111110000010B           ; BIT STRING
SHORT_INTEGER   DD  0FFFFFF82H                  ; HEX STRING MUST START
                                                ; WITH DIGIT
LONG_INTEGER    DQ  -126                        ; ORDINARY DECIMAL
SINGLE_REAL     DD  -126.0                      ; NOTE PRESENCE OF '.'
DOUBLE_REAL     DQ  -1.26E2                     ; "SCIENTIFIC"
PACKED_DECIMAL  DT  -126                        ; ORDINARY DECIMAL INTEGER

; IN THE FOLLOWING, SIGN AND EXPONENT IS 'C005'
; SIGNIFICAND IS 'FC00...00', 'R' INFORMS ASSEMBLER THAT
; THE STRING REPRESENTS A REAL DATA TYPE.
;
EXTENDED_REAL   DT  0C005FC00000000000000R      ; HEX STRING
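The bit patterns behind these definitions can be checked with a short C
sketch. It assumes that the C float and double types use the IEEE single and
double formats, which is the case on 80387-based systems but is not
guaranteed by C in general. All function names are hypothetical; the
extended real image is built by hand from an integer so that the sketch does
not depend on how a compiler implements long double.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Two's complement image of a word integer. */
uint16_t word_image(int16_t v) { return (uint16_t)v; }

/* Bit pattern of a single real. */
uint32_t single_image(float v)
{
    uint32_t bits;
    assert(sizeof bits == sizeof v);
    memcpy(&bits, &v, sizeof bits);
    return bits;
}

/* Bit pattern of a double real. */
uint64_t double_image(double v)
{
    uint64_t bits;
    assert(sizeof bits == sizeof v);
    memcpy(&bits, &v, sizeof bits);
    return bits;
}

/* Extended real image of a nonzero integer: a 16-bit sign/exponent
   field (bias 16383) and a 64-bit significand with explicit integer bit. */
void extended_image(int32_t v, uint16_t *sign_exp, uint64_t *significand)
{
    uint64_t mag = (uint64_t)(v < 0 ? -(int64_t)v : (int64_t)v);
    unsigned exp = 63u;

    assert(v != 0 && sign_exp != NULL && significand != NULL);
    while ((mag & 0x8000000000000000ULL) == 0) {   /* normalize */
        mag <<= 1;
        exp--;
    }
    *sign_exp = (uint16_t)((v < 0 ? 0x8000u : 0u) | (16383u + exp));
    *significand = mag;
}
```

For -126 the magnitude is 1.111110B * 2^6, so the biased exponent is
16383 + 6 = 4005H and the sign bit raises the field to C005H.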
5064 5.1.4.2 Records and Structures
5066 The ASM386 RECORD and STRUC (structure) declaratives can be very useful in
5067 NPX programming. The record facility can be used to define the bit fields of
5068 the control, status, and tag words. Figure 5-3 shows one definition of the
5069 status word and how it might be used in a routine that polls the 80387 until
5070 it has completed an instruction.
5072 Because structures allow different but related data types to be grouped
5073 together, they often provide a natural way to represent "real world" data
5074 organizations. The fact that the structure template may be "moved" about in
5075 memory adds to its flexibility. Figure 5-4 shows a simple structure that
5076 might be used to represent data consisting of a series of test score
5077 samples. A structure could also be used to define the organization of the
5078 information stored and loaded by the FSTENV and FLDENV instructions.
5081 Figure 5-3. Status Word Record Definition
5083 ; RESERVE SPACE FOR STATUS WORD
5085 ; LAY OUT STATUS WORD FIELDS
5101 ; REDUCE UNTIL COMPLETE
5104 TEST STATUS_WORD, MASK_COND_CODE2
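The masks that an ASM386 RECORD definition yields for the status word can be
mirrored in C as plain constants. The sketch below covers only the two
fields relevant to a reduction loop, C2 (bit 10) and TOP (bits 11-13); the
macro and function names are hypothetical and merely echo the
MASK_COND_CODE2 name used in the figure.

```c
#include <assert.h>
#include <stdint.h>

#define MASK_COND_CODE2  0x0400u   /* C2, bit 10         */
#define MASK_STACK_TOP   0x3800u   /* TOP, bits 11-13    */
#define SHIFT_STACK_TOP  11

/* Extract the TOP field from a stored status word. */
unsigned stack_top(uint16_t sw)
{
    return (sw & MASK_STACK_TOP) >> SHIFT_STACK_TOP;
}

/* Physical register number that ST(i) refers to. */
unsigned physical_register(uint16_t sw, unsigned i)
{
    assert(i < 8u);
    return (stack_top(sw) + i) & 7u;
}

/* After a partial-remainder instruction, C2 = 1 means the reduction is
   incomplete and the instruction should be executed again. */
int reduction_incomplete(uint16_t sw)
{
    return (sw & MASK_COND_CODE2) != 0;
}
```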
5108 Figure 5-4. Structure Definition
5111 N_OBS DD ? ; SHORT INTEGER
5112 MEAN DQ ? ; DOUBLE REAL
5113 MODE DW ? ; WORD INTEGER
5114 STD_DEV DQ ? ; DOUBLE REAL
5115 ; ARRAY OF OBSERVATIONS -- WORD INTEGER
5116 TEST_SCORES DW 1000 DUP (?)
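The offsets that such a structure template assigns to its fields can be
worked out with a short C sketch. It assumes the fields are packed back to
back in declaration order with no padding, which is how an assembler
structure differs from a typical C struct; the enum and function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Field sizes in declaration order: DD, DQ, DW, DQ, DW 1000 DUP (?). */
static const size_t field_bytes[] = { 4, 8, 2, 8, 2 * 1000 };

enum { N_OBS, MEAN, MODE, STD_DEV, TEST_SCORES, FIELD_COUNT };

/* Offset of a field from the start of the structure. Passing
   FIELD_COUNT returns the total size of the structure. */
size_t field_offset(int field)
{
    size_t off = 0;
    int i;

    assert(field >= 0 && field <= FIELD_COUNT);
    for (i = 0; i < field; i++)
        off += field_bytes[i];
    return off;
}
```

In this layout MEAN and STD_DEV fall at offsets 4 and 14, so they are not
doubleword-aligned unless the fields are reordered or the structure is
placed accordingly.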
5120 5.1.4.3 Addressing Methods
5122 80387 memory data can be accessed with any of the memory addressing methods
5123 provided by the ModR/M byte and (optionally) the SIB byte. This means that
5124 80387 data types can be incorporated in data aggregates ranging from simple
5125 to complex according to the needs of the application. The addressing methods
5126 and the ASM386 notation used to specify them in instructions make the
5127 accessing of structures, arrays, arrays of structures, and other
5128 organizations direct and straightforward. Table 5-3 gives several examples
of 80387 instructions coded with operands that illustrate different
addressing methods.
Table 5-3. Addressing Method Examples

Coding                      Interpretation
FIADD ALPHA                 ALPHA is a simple scalar (mode is direct).
FDIVR ALPHA.BETA            BETA is a field in a structure that is
                            "overlaid" on ALPHA (mode is direct).
FMUL QWORD PTR [BX]         BX contains the address of a long real
                            variable (mode is register indirect).
FSUB ALPHA [SI]             ALPHA is an array and SI contains the
                            offset of an array element from the start of
                            the array (mode is indexed).
FILD [BP].BETA              BP contains the address of a structure on
                            the CPU stack and BETA is a field in the
                            structure (mode is based).
FBLD TBYTE PTR [BX] [DI]    BX contains the address of a packed
                            decimal array and DI contains the offset of
                            an array element (mode is based indexed).
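All of the modes in Table 5-3 are special cases of one computation: a base
register, plus a scaled index register, plus a displacement. The C sketch
below models that sum for 32-bit addressing; the function name is
hypothetical, and unused components are simply passed as zero (with a scale
of 1).

```c
#include <assert.h>
#include <stdint.h>

/* Effective address = base + index * scale + displacement, with
   32-bit wraparound. scale must be 1, 2, 4, or 8. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           unsigned scale, uint32_t disp)
{
    assert(scale == 1u || scale == 2u || scale == 4u || scale == 8u);
    return base + index * scale + disp;
}
```

Direct mode uses only the displacement, register indirect only the base,
indexed a displacement plus an index, based a base plus a displacement, and
based indexed a base plus an index.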
5158 5.1.5 Comparative Programming Example
5160 Figures 5-5 and 5-6 show the PL/M-386 and ASM386 code for a simple 80387
5161 program, called ARRSUM. The program references an array (X$ARRAY), which
5162 contains 0-100 single real values; the integer variable N$OF$X indicates the
5163 number of array elements the program is to consider. ARRSUM steps through
5164 X$ARRAY accumulating three sums:
■ SUM$X, the sum of the array values
■ SUM$INDEXES, the sum of each array value times its index, where the
  index of the first element is 1, the second is 2, etc.
■ SUM$SQUARES, the sum of each array element squared
5173 (A true program, of course, would go beyond these steps to store and use
5174 the results of these calculations.) The control word is set with the
5175 recommended values: round to nearest, 64-bit precision, interrupts enabled,
5176 and all exceptions masked except invalid operation. It is assumed that an
5177 exception handler has been written to field the invalid operation if it
occurs, and that it is invoked by interrupt vector 16. Either version of
the program will run on an actual or an emulated 80387 without altering the
code shown.
5182 The PL/M-386 version of ARRSUM (Figure 5-5) is very straightforward and
5183 illustrates how easily the 80387 can be used in this language. After
5184 declaring variables, the program calls built-in procedures to initialize the
processor (or its emulator) and to load the control word. The program
5186 clears the sum variables and then steps through X$ARRAY with a DO-loop. The
5187 loop control takes into account PL/M-386's practice of considering the
5188 index of the first element of an array to be 0. In the computation of
5189 SUM$INDEXES, the built-in procedure FLOAT converts I+1 from integer to real
5190 because the language does not support "mixed mode" arithmetic. One of the
5191 strengths of the NPX, of course, is that it does support arithmetic on mixed
5192 data types (because all values are converted internally to the 80-bit
5193 extended-precision real format).
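For comparison, the same computation can be sketched in C. This is not one
of the manual's figures; the struct and function names are hypothetical. The
running sums are kept in long double, on the assumption that this
approximates holding them in NPX registers, and are rounded to single
precision only when stored.

```c
#include <assert.h>
#include <stddef.h>

struct sums {
    float sum_x;
    float sum_indexes;
    float sum_squares;
};

/* Accumulate the three ARRSUM sums over the first n_of_x elements. */
void arrsum(const float *x_array, size_t n_of_x, struct sums *out)
{
    long double sx = 0, si = 0, sq = 0;
    size_t i;

    assert(out != NULL && (x_array != NULL || n_of_x == 0));
    for (i = 0; i < n_of_x; i++) {
        long double x = x_array[i];
        sx += x;
        si += x * (long double)(i + 1);   /* index of first element is 1 */
        sq += x * x;
    }
    out->sum_x = (float)sx;
    out->sum_indexes = (float)si;
    out->sum_squares = (float)sq;
}
```

As in the PL/M-386 version, the array subscript starts at 0 while the index
used in SUM$INDEXES starts at 1, hence the i + 1.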
5195 The ASM386 version (Figure 5-6) defines the external procedure INIT387,
5196 which makes the different initialization requirements of the processor and
5197 its emulator transparent to the source code. After defining the data and
5198 setting up the segment registers and stack pointer, the program calls
5199 INIT387 and loads the control word. The computation begins with the next
5200 three instructions, which clear three registers by loading (pushing) zeros
5201 onto the stack. As shown in Figure 5-7, these registers remain at the
5202 bottom of the stack throughout the computation while temporary values are
5203 pushed on and popped off the stack above them.
5205 The program uses the CPU LOOP instruction to control its iteration through
5206 X_ARRAY; register ECX, which LOOP automatically decrements, is loaded with
5207 N_OF_X, the number of array elements to be summed. Register ESI is used to
5208 select (index) the array elements. The program steps through X_ARRAY from
5209 back to front, so ESI is initialized to point at the element just beyond the
5210 first element to be processed. The ASM386 TYPE operator is used to determine
5211 the number of bytes in each array element. This permits changing X_ARRAY to
a double-precision real array by simply changing its definition (DD to DQ)
and reassembling.
5215 Figure 5-7 shows the effect of the instructions in the program loop on the
5216 NPX register stack. The figure assumes that the program is in its first
5217 iteration, that N_OF_X is 20, and that X_ARRAY(19) (the 20th element)
5218 contains the value 2.5. When the loop terminates, the three sums are left as
the top stack elements so that the program ends by simply popping them into
memory.
5223 Figure 5-5. Sample PL/M-386 Program
5225 XENIX286 PL/M-386 DEBUG X291a COMPILATION OF MODULE ARRAYSUM
5226 OBJECT MODULE PLACED IN arraysum.obj
5227 COMPILER INVOKED BY: plm386 arraysum.plm
/***********************************************************
 *                    ARRAYSUM MODULE                      *
 ***********************************************************/

arraysum: do;
    declare (sum$x, sum$indexes, sum$squares) real;
    declare x$array(100) real;
    declare (n$of$x, i) integer;
    declare control$387 literally '033eh';

    /* Assume x$array and n$of$x are initialized */
    call init$real$math$unit;
    call set$real$mode(control$387);

    /* Clear sums */
    sum$x, sum$indexes, sum$squares = 0.0;

    /* Loop through array, accumulating sums */
    do i = 0 to n$of$x - 1;
        sum$x = sum$x + x$array(i);
        sum$indexes = sum$indexes + (x$array(i)*float(i+1));
        sum$squares = sum$squares + (x$array(i)*x$array(i));
    end;
end arraysum;
5264 CODE AREA SIZE = 000000A0H 160D
5265 CONSTANT AREA SIZE = 00000004H 4D
5266 VARIABLE AREA SIZE = 000001A4H 420D
5267 MAXIMUM STACK SIZE = 00000004H 4D
5277 END OF PL/M-386 COMPILATION
5280 Figure 5-6. Sample ASM386 Program
5282 XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE ARRAYSUM
5283 OBJECT MODULE PLACED IN arraysum.obj
5284 ASSEMBLER INVOKED BY: asm386 arraysum.asm
; Define initialization routine
                extrn   init387:far

; Allocate space for data
data            segment rw public
control_387     dw      033eh
n_of_x          dd      ?
x_array         dd      100 dup (?)
sum_squares     dd      ?
sum_indexes     dd      ?
sum_x           dd      ?
data            ends

; Allocate CPU stack space
stack           stackseg 400

code            segment er public
                assume  ds:data, ss:stack
start:
                mov     ax, data
                mov     ds, ax
                mov     ax, stack
                mov     ss, ax
                mov     esp, stackstart stack

; Assume x_array and n_of_x have been initialized

; Prepare the 80387 or its emulator
                call    init387
                fldcw   control_387

; Clear three registers to hold the running sums
                fldz
                fldz
                fldz

; Set up ECX as loop counter and ESI as index into x_array
                mov     ecx, n_of_x
                mov     eax, type x_array
                imul    ecx
                mov     esi, eax

; ESI now contains the offset just beyond the last element

; Loop through x_array and accumulate the sums
sum_next:
; back up one element and push it on the stack
                sub     esi, type x_array
                fld     x_array[esi]

; add to the sum and duplicate x on the stack
                fadd    st(3), st
                fld     st

; square it and add into the sum of squares
                fmul    st, st
                faddp   st(2), st

; multiply x by (index+1), add into the sum of indexes, and discard
                fimul   n_of_x
                faddp   st(2), st

; reduce index for next iteration
                dec     n_of_x
                loop    sum_next

; Pop sums into memory
pop_results:
                fstp    sum_squares
                fstp    sum_indexes
                fstp    sum_x
                fwait

code            ends
                end     start, ds:data, ss:stack
5390 ASSEMBLY COMPLETE, NO WARNINGS, NO ERRORS.
Figure 5-7. Instructions and Register Stack

After FLDZ, FLDZ, FLDZ:
    ST(0)    0.0     SUM_SQUARES
    ST(1)    0.0     SUM_INDEXES
    ST(2)    0.0     SUM_X

After FLD X_ARRAY[SI]:
    ST(0)    2.5     X_ARRAY(19)
    ST(1)    0.0     SUM_SQUARES
    ST(2)    0.0     SUM_INDEXES
    ST(3)    0.0     SUM_X

After FADD ST(3), ST:
    ST(0)    2.5     X_ARRAY(19)
    ST(1)    0.0     SUM_SQUARES
    ST(2)    0.0     SUM_INDEXES
    ST(3)    2.5     SUM_X

After FLD ST:
    ST(0)    2.5     X_ARRAY(19)
    ST(1)    2.5     X_ARRAY(19)
    ST(2)    0.0     SUM_SQUARES
    ST(3)    0.0     SUM_INDEXES
    ST(4)    2.5     SUM_X

After FMUL ST, ST:
    ST(0)    6.25    X_ARRAY(19)^2
    ST(1)    2.5     X_ARRAY(19)
    ST(2)    0.0     SUM_SQUARES
    ST(3)    0.0     SUM_INDEXES
    ST(4)    2.5     SUM_X

After FADDP ST(2), ST:
    ST(0)    2.5     X_ARRAY(19)
    ST(1)    6.25    SUM_SQUARES
    ST(2)    0.0     SUM_INDEXES
    ST(3)    2.5     SUM_X

After FIMUL N_OF_X:
    ST(0)    50.0    X_ARRAY(19)*20
    ST(1)    6.25    SUM_SQUARES
    ST(2)    0.0     SUM_INDEXES
    ST(3)    2.5     SUM_X

After FADDP ST(2), ST:
    ST(0)    6.25    SUM_SQUARES
    ST(1)    50.0    SUM_INDEXES
    ST(2)    2.5     SUM_X
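The stack states in Figure 5-7 can be reproduced with a small C model of the
register stack. This is an illustrative sketch only: the names are
hypothetical, values are held as double rather than extended real, and tags
and exceptions are not modeled. Each statement in loop_body corresponds to
one instruction named in the figure.

```c
#include <assert.h>
#include <stddef.h>

#define NPX_DEPTH 8

struct npx_stack {
    double reg[NPX_DEPTH];   /* reg[0] is ST(0) */
    int depth;               /* number of registers in use */
};

void npx_push(struct npx_stack *s, double v)
{
    int i;

    assert(s != NULL && s->depth < NPX_DEPTH);
    for (i = s->depth; i > 0; i--)
        s->reg[i] = s->reg[i - 1];
    s->reg[0] = v;
    s->depth++;
}

void npx_pop(struct npx_stack *s)
{
    int i;

    assert(s != NULL && s->depth > 0);
    for (i = 1; i < s->depth; i++)
        s->reg[i - 1] = s->reg[i];
    s->depth--;
}

/* One pass of the loop: x is the array element, n its 1-based index. */
void loop_body(struct npx_stack *s, double x, int n)
{
    npx_push(s, x);                       /* FLD   X_ARRAY[SI] */
    s->reg[3] += s->reg[0];               /* FADD  ST(3), ST   */
    npx_push(s, s->reg[0]);               /* FLD   ST          */
    s->reg[0] *= s->reg[0];               /* FMUL  ST, ST      */
    s->reg[2] += s->reg[0]; npx_pop(s);   /* FADDP ST(2), ST   */
    s->reg[0] *= n;                       /* FIMUL N_OF_X      */
    s->reg[2] += s->reg[0]; npx_pop(s);   /* FADDP ST(2), ST   */
}
```

Starting from three zeros and calling loop_body with x = 2.5 and n = 20
leaves the three sums in the same positions and with the same values as the
last stack state in the figure.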
5444 5.1.6 80387 Emulation
The programming of applications to execute on both 80386 systems with an
80387 and 80386 systems without an 80387 is made much easier by the existence of an
5448 80387 emulator for 80386 systems. The Intel EMUL387 emulator offers a
5449 complete software counterpart to the 80387 hardware; NPX instructions can be
5450 simply emulated in software rather than being executed in hardware. With
5451 software emulation, the distinction between 80386 systems with or without an
5452 80387 is reduced to a simple performance differential. Identical numeric
5453 programs will simply execute more slowly (using software emulation of NPX
5454 instructions) on 80386 systems without an 80387 than on an 80386/80387
5455 system executing NPX instructions directly.
5457 When incorporated into the systems software, the emulation of NPX
5458 instructions on the 80386 systems is completely transparent to the
5459 applications programmer. Applications software needs no special libraries,
linking, or other activity to allow it to run on an 80386 with 80387
emulation.
5463 To the applications programmer, the development of programs for 80386
5464 systems is the same whether the 80387 NPX hardware is available or not. The
5465 full 80387 instruction set is available for use, with NPX instructions being
5466 either emulated or executed directly. Applications programmers need not be
5467 concerned with the hardware configuration of the computer systems on which
5468 their applications will eventually run.
For systems programmers, details relating to 80387 emulators are described
in Chapter 6.
5473 The EMUL387 software emulator for 80386 systems is available from Intel as
5474 a separate program product.
5477 5.2 Concurrent Processing With the 80387
5479 Because the 80386 CPU and the 80387 NPX have separate execution units, it
5480 is possible for the NPX to execute numeric instructions in parallel with
5481 instructions executed by the CPU. This simultaneous execution of different
5482 instructions is called concurrency.
5484 No special programming techniques are required to gain the advantages of
5485 concurrent execution; numeric instructions for the NPX are simply placed in
5486 line with the instructions for the CPU. CPU and numeric instructions are
5487 initiated in the same order as they are encountered by the CPU in its
5488 instruction stream. However, because numeric operations performed by the NPX
5489 generally require more time than operations performed by the CPU, the CPU
5490 can often execute several of its instructions before the NPX completes a
5491 numeric instruction previously initiated.
5493 This concurrency offers obvious advantages in terms of execution
5494 performance, but concurrency also imposes several rules that must be
observed in order to assure proper synchronization of the 80386 CPU and
80387 NPX.
5498 All Intel high-level languages automatically provide for and manage
5499 concurrency in the NPX. Assembly-language programmers, however, must
5500 understand and manage some areas of concurrency in exchange for the
5501 flexibility and performance of programming in assembly language. This
5502 section is for the assembly-language programmer or well-informed
5503 high-level-language programmer.
5506 5.2.1 Managing Concurrency
5508 Concurrent execution of the host and 80387 is easy to establish and
5509 maintain. The activities of numeric programs can be split into two major
5510 areas: program control and arithmetic. The program control part performs
5511 activities such as deciding what functions to perform, calculating addresses
5512 of numeric operands, and loop control. The arithmetic part simply adds,
5513 subtracts, multiplies, and performs other operations on the numeric
operands. The NPX and host are designed to handle these two parts separately
and efficiently.
5517 Concurrency management is required to check for an exception before letting
5518 the 80386 change a value just used by the 80387. Almost any numeric
5519 instruction can, under the wrong circumstances, produce a numeric exception.
5520 For programmers in higher-level languages, all required synchronization is
automatically provided by the appropriate compiler. In assembly language,
exception synchronization remains the responsibility of the programmer.
A complication is that a programmer may not expect a numeric program to
cause numeric exceptions, yet in some systems they may occur regularly. To
5527 better understand these points, consider what can happen when the NPX
5528 detects an exception.
5530 Depending on options determined by the software system designer, the NPX
5531 can perform one of two things when a numeric exception occurs:
■ The NPX can provide a default fix-up for selected numeric exceptions.
  Programs can mask individual exception types to indicate that the NPX
  should generate a safe, reasonable result whenever that exception
  occurs. The default exception fix-up activity is treated by the NPX as
  part of the instruction causing the exception; no external indication
  of the exception is given. When exceptions are detected, a flag is set
  in the numeric status register, but no information regarding where or
  when is available. If the NPX performs its default action for all
  exceptions, then the need for exception synchronization is not
  manifest. However, as will be shown later, this is not sufficient
  reason to ignore exception synchronization when designing programs that
  use the 80387.
■ As an alternative to the NPX default fix-up of numeric exceptions, the
5547 80386 CPU can be notified whenever an exception occurs. When a numeric
5548 exception is unmasked and the exception occurs, the NPX stops further
5549 execution of the numeric instruction and signals this event to the CPU.
5550 On the next occurrence of an ESC or WAIT instruction, the CPU traps to
5551 a software exception handler. The exception handler can then implement
5552 any sort of recovery procedures desired for any numeric exception
5553 detectable by the NPX. Some ESC instructions do not check for
5554 exceptions. These are the nonwaiting forms FNINIT, FNSTENV, FNSAVE,
5555 FNSTSW, FNSTCW, and FNCLEX.
5557 When the NPX signals an unmasked exception condition, it is requesting
5558 help. The fact that the exception was unmasked indicates that further
5559 numeric program execution under the arithmetic and programming rules of the
5560 NPX is unreasonable.
5562 If concurrent execution is allowed, the state of the CPU when it recognizes
5563 the exception is undefined. The CPU may have changed many of its internal
5564 registers and be executing a totally different program by the time the
5565 exception occurs. To handle this situation, the NPX has special registers
5566 updated at the start of each numeric instruction to describe the state of
5567 the numeric program when the failed instruction was attempted.
5569 Exception synchronization ensures that the NPX is in a well-defined state
5570 after an unmasked numeric exception occurs. Without a well-defined state, it
5571 would be impossible for exception recovery routines to determine why the
5572 numeric exception occurred, or to recover successfully from the exception.
5574 The following two sections illustrate the need to always consider
5575 exception synchronization when writing 80387 code, even when the code is
5576 initially intended for execution with exceptions masked. If the code is
5577 later moved to an environment where exceptions are unmasked, the same code
5578 may not work correctly. An example of how some instructions written without
5579 exception synchronization will work initially, but fail when moved into a
5580 new environment is shown in Figure 5-8.
5583 Figure 5-8. Exception Synchronization Examples
INCORRECT ERROR SYNCHRONIZATION

FILD    COUNT   ; NPX instruction
INC     COUNT   ; CPU instruction alters operand
FSQRT           ; subsequent NPX instruction -- error from
                ; previous NPX instruction detected here

PROPER ERROR SYNCHRONIZATION

FILD    COUNT   ; NPX instruction
FSQRT           ; subsequent NPX instruction -- error from
                ; previous NPX instruction detected here
INC     COUNT   ; CPU instruction alters operand
5600 5.2.1.1 Incorrect Exception Synchronization
5602 In Figure 5-8, three instructions are shown to load an integer, calculate
5603 its square root, then increment the integer. The 80386-to-80387 interface
5604 and synchronous execution of the NPX emulator will allow this program to
5605 execute correctly when no exceptions occur on the FILD instruction.
5607 This situation changes if the 80387 numeric register stack is extended to
5608 memory. To extend the NPX stack to memory, the invalid exception is
5609 unmasked. A push to a full register or pop from an empty register sets SF
5610 and causes an invalid exception.
5612 The recovery routine for the exception must recognize this situation, fix
5613 up the stack, then perform the original operation. The recovery routine
5614 will not work correctly in the first example shown in the figure. The
5615 problem is that the value of COUNT is incremented before the NPX can signal
5616 the exception to the CPU. Because COUNT is incremented before the exception
5617 handler is invoked, the recovery routine will load an incorrect value of
5618 COUNT, causing the program to fail or behave unreliably.
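The failure can be illustrated with a small model. The C fragment below is
a sketch only and does not describe real hardware: ToyNpx and the functions
around it are hypothetical names, and the model assumes that the recovery
routine redoes the load by reading the operand again through a saved
operand address. It shows that the value seen by the recovery routine
depends on whether INC COUNT runs before or after the next ESC instruction.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model (hypothetical): the NPX reports an unmasked exception late,
 * at the next ESC or WAIT, and remembers only the ADDRESS of the memory
 * operand of the instruction that failed. */
typedef struct {
    bool error_pending;        /* stands in for the ERROR# signal   */
    const int32_t *operand;    /* stands in for the operand pointer */
} ToyNpx;

/* FILD that raises an unmasked exception (e.g. a full register stack). */
void toy_fild_faulting(ToyNpx *npx, const int32_t *mem)
{
    npx->error_pending = true;
    npx->operand = mem;
}

/* Next ESC or WAIT: the CPU notices the pending error and the recovery
 * routine re-reads the operand from memory to redo the load.
 * Returns the value the recovery routine loads (0 if nothing pending). */
int32_t toy_next_esc_or_wait(ToyNpx *npx)
{
    int32_t reloaded = 0;
    if (npx->error_pending) {
        reloaded = *npx->operand;
        npx->error_pending = false;
    }
    return reloaded;
}

/* FILD COUNT / INC COUNT / FSQRT: COUNT changes before the trap. */
int32_t recovered_value_incorrect_order(int32_t count)
{
    ToyNpx npx = { false, NULL };
    toy_fild_faulting(&npx, &count);
    count += 1;                              /* INC COUNT */
    return toy_next_esc_or_wait(&npx);       /* FSQRT     */
}

/* FILD COUNT / FSQRT / INC COUNT: the trap is taken first. */
int32_t recovered_value_proper_order(int32_t count)
{
    ToyNpx npx = { false, NULL };
    toy_fild_faulting(&npx, &count);
    int32_t seen = toy_next_esc_or_wait(&npx);   /* FSQRT     */
    count += 1;                                  /* INC COUNT */
    (void)count;
    return seen;
}
```

In this model, with COUNT equal to 9, the first ordering hands the recovery
routine the value 10, while the second hands it the intended value 9.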
5621 5.2.1.2 Proper Exception Synchronization
5623 Exception synchronization relies on the WAIT instruction and the BUSY# and
5624 ERROR# signals of the 80387. When an unmasked exception occurs in the 80387,
5625 it asserts the ERROR# signal, signaling to the CPU that a numeric exception
5626 has occurred. The next time the CPU encounters a WAIT instruction or an
5627 exception-checking ESC instruction, the CPU acknowledges the ERROR# signal
5628 by trapping automatically to Interrupt #16, the processor-extension
5629 exception vector. If the following ESC or WAIT instruction is properly
5630 placed, the CPU will not yet have disturbed any information vital to
5631 recovery from the exception.
5634 Chapter 6 System-Level Numeric Programming
----------------------------------------------------------------------------
5638 System programming for 80387 systems requires a more detailed understanding
5639 of the 80387 NPX than does application programming. Such things as
5640 emulation, initialization, exception handling, and data and error
5641 synchronization are all the responsibility of the systems programmer. These
5642 topics are covered in detail in the sections that follow.
5645 6.1 80386/80387 Architecture
5647 On a software level, the 80387 NPX appears as an extension of the 80386
5648 CPU. On the hardware level, however, the mechanisms by which the 80386 and
5649 80387 interact are more complex. This section describes how the 80387 NPX
5650 and 80386 CPU interact and points out features of this interaction that are
5651 of interest to systems programmers.
5654 6.1.1 Instruction and Operand Transfer
5656 All transfers of instructions and operands between the 80387 and system
5657 memory are performed by the 80386 using I/O bus cycles. The 80387 appears to
5658 the CPU as a special peripheral device. It is special in two respects: the
5659 CPU initiates I/O automatically when it encounters ESC instructions, and the
5660 CPU uses reserved I/O addresses to communicate with the 80387. These I/O
5661 operations are completely transparent to software.
5663 Because the 80386 actually performs all transfers between the 80387 and
5664 memory, no additional bus drivers, controllers, or other components are
5665 necessary to interface the 80387 NPX to the local bus. The 80387 can utilize
5666 instructions and operands located in any memory accessible to the 80386 CPU.
5669 6.1.2 Independent of CPU Addressing Modes
Unlike the 80287, the 80387 is not sensitive to the addressing and memory
management of the CPU. The 80387 operates the same regardless of whether the
80386 CPU is operating in real-address mode, in protected mode, or in
virtual 8086 mode.
5676 The instruction FSETPM that was necessary in 80286/80287 systems to set the
5677 80287 into protected mode is not needed for the 80387. The 80387 treats this
5678 instruction as a no-op.
5680 Because the 80386 actually performs all transfers between the 80387 and
5681 memory, 80387 instructions can utilize any memory location accessible by the
5682 task currently executing on the 80386. When operating in protected mode, all
5683 references to memory operands are automatically verified by the 80386's
5684 memory management and protection mechanisms as for any other memory
5685 references by the currently-executing task. Protection violations associated
5686 with NPX instructions automatically cause the 80386 to trap to an
5687 appropriate exception handler.
To the numerics programmer, the operating modes of the 80386 affect only
the manner in which the NPX instruction and data pointers are represented in
memory following an FSAVE or FSTENV instruction. Each of these instructions
produces one of four formats depending on both the operating mode and the
operand-size attribute in effect for the instruction. The differences are
detailed in the discussion of the FSAVE and FSTENV instructions.
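The choice among the four formats can be sketched in C as follows. This is
an illustration, not Intel code: the type and function names are
hypothetical. It assumes that virtual 8086 mode uses the same pointer
representation as real-address mode, that the environment occupies 14 bytes
with a 16-bit operand size and 28 bytes with a 32-bit operand size, and that
FSAVE appends the eight 10-byte numeric registers.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    POINTERS_REAL_FORMAT,        /* real-address and virtual 8086 mode */
    POINTERS_PROTECTED_FORMAT    /* protected mode                     */
} PointerFormat;

typedef struct {
    PointerFormat pointers;   /* how instruction/data pointers are stored  */
    bool operand_size_32;     /* operand-size attribute of the instruction */
} NpxImageFormat;

/* Select one of the four formats from the CPU mode and operand size. */
NpxImageFormat npx_image_format(bool protected_mode, bool virtual_8086,
                                bool operand_size_32)
{
    NpxImageFormat f;
    f.pointers = (protected_mode && !virtual_8086)
                     ? POINTERS_PROTECTED_FORMAT
                     : POINTERS_REAL_FORMAT;
    f.operand_size_32 = operand_size_32;
    return f;
}

/* Bytes written by FSTENV: seven 16-bit or seven 32-bit slots. */
size_t fstenv_image_bytes(NpxImageFormat f)
{
    return f.operand_size_32 ? 28 : 14;
}

/* Bytes written by FSAVE: the environment plus eight 10-byte registers. */
size_t fsave_image_bytes(NpxImageFormat f)
{
    return fstenv_image_bytes(f) + 8 * 10;
}
```

Under these assumptions a 32-bit protected-mode handler must reserve 108
bytes for FSAVE and 28 bytes for FSTENV.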
5698 6.1.3 Dedicated I/O Locations
The 80387 NPX does not require that any memory addresses be set aside for
special purposes. The 80387 does make use of I/O port addresses, but these
are 32-bit addresses with the high-order bit set (i.e., 80000000H or above);
therefore, these I/O operations are completely transparent to 80386
software. Because these addresses are beyond the 64-Kbyte addressing limit
of the I/O instructions, 80386 programs cannot reference these reserved
I/O addresses directly.
5709 6.2 Processor Initialization and Control
One of the principal responsibilities of systems software is the
initialization, monitoring, and control of the hardware and software
resources of the system, including the 80387 NPX. This section describes
issues related to system initialization and control, including recognition
of the NPX, emulation of the 80387 NPX in software if the hardware is not
available, and the handling of exceptions that may occur during the
execution of 80387 instructions.
5720 6.2.1 System Initialization
During initialization of an 80386 system, systems software must
- Recognize the presence or absence of the NPX.
- Set flags in the 80386 MSW to reflect the state of the numeric
  environment.
5729 If an 80387 NPX is present in the system, the NPX must be initialized. All
5730 of these activities can be quickly and easily performed as part of the
5731 overall system initialization.
5734 6.2.2 Hardware Recognition of the NPX
5736 The 80386 identifies the type of its coprocessor (80287 or 80387) by
5737 sampling its ERROR# input some time after the falling edge of RESET and
5738 before executing the first ESC instruction. The 80287 keeps its ERROR#
5739 output in inactive state after hardware reset; the 80387 keeps its ERROR#
5740 output in active state after hardware reset. The 80386 records this
5741 difference in the ET bit of control register zero (CR0). The 80386
5742 subsequently uses ET to control its interface with the coprocessor. If ET is
5743 set, it employs the 32-bit protocol of the 80387; if ET is not set, it
5744 employs the 16-bit protocol of the 80287.
5746 Systems software can (if necessary) change the value of ET. There are three
5747 reasons that ET may not be set:
5749 1. An 80287 is actually present.
5751 2. No coprocessor is present.
5753 3. An 80387 is present but it is connected in a nonstandard manner that
5754 does not trigger the setting of ET.
An example of case three is the PC/AT-compatible design described in
Appendix F. In such cases, initialization software may need to change the
value of ET.
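The role of ET can be sketched in C. The fragment is illustrative only: it
operates on a copy of the CR0 value (reading and writing CR0 itself requires
privileged instructions that are not shown), the function names are
hypothetical, and ET is assumed to be bit 4 of CR0.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_ET (1u << 4)   /* extension type: set for 80387, clear for 80287 */

typedef enum {
    NPX_PROTOCOL_16BIT_80287,
    NPX_PROTOCOL_32BIT_80387
} NpxProtocol;

/* Protocol the 80386 uses for a given CR0 value. */
NpxProtocol protocol_from_cr0(uint32_t cr0)
{
    return (cr0 & CR0_ET) ? NPX_PROTOCOL_32BIT_80387
                          : NPX_PROTOCOL_16BIT_80287;
}

/* CR0 value with ET forced to match the coprocessor that software found,
 * for systems where the hardware sampling of ERROR# gave the wrong answer. */
uint32_t cr0_with_et(uint32_t cr0, bool npx_is_80387)
{
    return npx_is_80387 ? (cr0 | CR0_ET) : (cr0 & ~CR0_ET);
}
```

Initialization software for a nonstandard 80387 connection would compute
cr0_with_et(cr0, true) and write the result back to CR0.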
5761 6.2.3 Software Recognition of the NPX
5763 Figure 6-1 shows an example of a recognition routine that determines
5764 whether an NPX is present, and distinguishes between the 80387 and the
5765 8087/80287. This routine can be executed on any 80386, 80286, or 8086
5766 hardware configuration that has an NPX socket.
The example guards against the possibility of accidentally reading an
expected value from a floating data bus when no NPX is present. Data read
from a floating bus is undefined. By expecting to read a specific bit
pattern from the NPX, the routine protects itself from the indeterminate
state of the bus. The example also avoids depending on any values in
reserved bits, thereby maintaining compatibility with future numerics
coprocessors.
Figure 6-1. Software Routine to Recognize the 80287

$title('Test for presence of a Numerics Chip, Revision 1.0')

stack   segment stack 'stack'
        dw      100 dup (?)
sst     dw      ?
stack   ends

data    segment public 'data'
temp    dw      0h
data    ends

dgroup  group   data, stack
cgroup  group   code

code    segment public 'code'
        assume  cs:cgroup, ds:dgroup

start:
;       Look for an 8087, 80287, or 80387 NPX.
;       Note that we cannot execute WAIT on 8086/88 if no 8087 is present.

        fninit                      ; Must use non-wait form
        mov     si,offset dgroup:temp
        mov     word ptr [si],5A5AH ; Initialize temp to non-zero value
        fnstsw  [si]                ; Must use non-wait form of fstsw
                                    ; It is not necessary to use a WAIT
                                    ; instruction after fnstsw or fnstcw.
                                    ; Do not use one here.
        cmp     byte ptr [si],0     ; See if correct status with zeroes was read
        jne     no_npx              ; Jump if not a valid status word, meaning no NPX

;       Now see if ones can be correctly written from the control word.

        fnstcw  [si]                ; Look at the control word; do not use WAIT form
                                    ; Do not use a WAIT instruction here!
        mov     ax,[si]             ; See if ones can be written by NPX
        and     ax,103fh            ; See if selected parts of control word look OK
        cmp     ax,3fh              ; Check that ones and zeroes were correctly read
        jne     no_npx              ; Jump if no NPX is installed

;       Some numerics chip is installed. NPX instructions and WAIT are now safe.
;       See if the NPX is an 8087, 80287, or 80387.
;       This code is necessary if a denormal exception handler is used or the
;       new 80387 instructions will be used.

        fld1                        ; Must use default control word from FNINIT
        fldz                        ; Form infinity
        fdiv                        ; 8087/287 says +inf = -inf
        fld     st                  ; Form negative infinity
        fchs                        ; 80387 says +inf <> -inf
        fcompp                      ; See if they are the same and remove them
        fstsw   [si]                ; Look at status from FCOMPP
        mov     ax,[si]
        sahf                        ; See if the infinities matched
        je      found_87_287        ; Jump if 8087/287 is present

;       An 80387 is present. If denormal exceptions are used for an 8087/287,
;       they must be masked. The 80387 will automatically normalize denormal
;       operands faster than an exception handler can.

        jmp     found_387
no_npx:
;       set up for no NPX
        jmp     exit
found_87_287:
;       set up for 87/287
        jmp     exit
found_387:
exit:
code    ends
        end     start,ds:dgroup,ss:dgroup:sst
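The decisions made by the recognition routine can be restated in C. This is
a sketch of the logic only, not a replacement for the assembly routine: it
takes as inputs the words that FNSTSW and FNSTCW stored after FNINIT and the
outcome of the infinity comparison. The names NpxKind and classify_npx are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { NPX_NONE, NPX_8087_OR_80287, NPX_80387 } NpxKind;

/* status_after_fninit:  word found in memory after FNSTSW (memory was
 *                       preset to 5A5AH, so it is unchanged if no NPX wrote)
 * control_after_fninit: word found in memory after FNSTCW
 * infinities_equal:     true if +infinity compared equal to -infinity */
NpxKind classify_npx(uint16_t status_after_fninit,
                     uint16_t control_after_fninit,
                     bool infinities_equal)
{
    /* The low byte of the status word must read as zero after FNINIT. */
    if ((status_after_fninit & 0x00FFu) != 0)
        return NPX_NONE;

    /* Selected control word bits: all six masks set, bit 12 clear.
     * Reserved bits are deliberately excluded from the test. */
    if ((control_after_fninit & 0x103Fu) != 0x003Fu)
        return NPX_NONE;

    /* The 8087/80287 default treats +inf and -inf as equal;
     * the 80387 does not. */
    return infinities_equal ? NPX_8087_OR_80287 : NPX_80387;
}
```

For example, if no NPX is present and memory still holds 5A5AH, the first
test fails because the low byte is 5AH rather than zero.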
5874 6.2.4 Configuring the Numerics Environment
Once systems software has determined the presence or absence of the 80387
or 80287 NPX, it must set either the MP or the EM bit in control register
zero (CR0) of the 80386 accordingly. The initialization routine can either
- Set the MP bit in CR0 to allow numeric instructions to be executed
  directly by the NPX.
- Set the EM bit in CR0 to permit software emulation of the numeric
  instructions.
5886 The MP (monitor coprocessor) flag of CR0 indicates to the 80386 whether an
5887 NPX is physically available in the system. The MP flag controls the function
5888 of the WAIT instruction. When executing a WAIT instruction, the 80386 tests
5889 the task switched (TS) bit only if MP is set; if it finds TS set under these
5890 conditions, the CPU traps to exception #7.
The Emulation Mode (EM) bit of CR0 indicates to the 80386 whether NPX
functions are to be emulated. If the CPU finds EM set when it executes an
ESC instruction, program control is automatically trapped to exception #7,
giving the exception handler the opportunity to emulate the functions of an
80387.
For correct 80386 operation, the EM bit must never be set concurrently with
MP. If ESC instructions are to be executed, either the MP or the EM bit must
be set, but not both. The EM and MP bits of the 80386 are described in more
detail in the 80386 Programmer's Reference Manual. Software emulation for
the 80387 NPX is discussed in the "80387 Emulation" section later in this
chapter.
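The rules above can be sketched in C. The fragment works on a copy of the
CR0 value; the privileged instructions that read and write CR0 are not
shown. The function names are hypothetical, and the bit positions (MP in
bit 1, EM in bit 2, TS in bit 3) are assumed from the 80386 definition of
CR0.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_MP (1u << 1)   /* monitor coprocessor */
#define CR0_EM (1u << 2)   /* emulation           */
#define CR0_TS (1u << 3)   /* task switched       */

/* Choose exactly one of MP and EM, never both. */
uint32_t cr0_for_numerics(uint32_t cr0, bool npx_present)
{
    cr0 &= ~(CR0_MP | CR0_EM);
    return cr0 | (npx_present ? CR0_MP : CR0_EM);
}

/* WAIT traps to exception #7 only if MP and TS are both set. */
bool wait_traps_to_exception_7(uint32_t cr0)
{
    return (cr0 & CR0_MP) != 0 && (cr0 & CR0_TS) != 0;
}

/* An ESC instruction traps to exception #7 for emulation when EM is set. */
bool esc_traps_for_emulation(uint32_t cr0)
{
    return (cr0 & CR0_EM) != 0;
}
```

A system without an NPX would use cr0_for_numerics(cr0, false), after which
every ESC instruction reaches the emulator through exception #7.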
5906 6.2.5 Initializing the 80387
5908 Initializing the 80387 NPX simply means placing the NPX in a known state
5909 unaffected by any activity performed earlier. A single FNINIT instruction
5910 performs this initialization. All the error masks are set, all registers are
5911 tagged empty, TOP is set to zero, and default rounding and precision
5912 controls are set. Table 6-1 shows the state of the 80387 NPX following
5913 FINIT or FNINIT. This state is compatible with that of the 80287 after
5914 FINIT or after hardware RESET.
5916 The FNINIT instruction does not leave the 80387 in the same state as that
5917 which results from the hardware RESET signal. Following a hardware RESET
5918 signal, such as after initial power-up, the state of the 80387 differs in
5919 the following respects:
1. The mask bit for the invalid-operation exception is reset.
2. The invalid-operation exception flag is set.
3. The exception-summary bit is set (along with its mirror image, the
   B-bit).
5928 These settings cause assertion of the ERROR# signal as described
5929 previously. The FNINIT instruction must be used to change the 80387 state to
5930 one compatible with the 80287.
Table 6-1. NPX Processor State Following Initialization

Field                      Value     Interpretation

Control Word
  Infinity Control (*)     0         Affine
  Rounding Control         00        Round to nearest
  Precision Control        11        64 bits
  Exception Masks          111111    All exceptions masked

Status Word
  Condition Code           0000      --
  Stack Top                000       Register 0 is stack top
  Exception Summary        0         No exceptions
  Exception Flags          000000    No exceptions

Tag Word
  Tags                     11        Empty

Registers                  N.C.      Not changed

Exception Pointers
  Instruction Code         N.C.      Not changed
  Instruction Address      N.C.      Not changed
  Operand Address          N.C.      Not changed

(*) The 80387 does not have infinity control. This value is listed to
    emphasize that programs written for the 80287 may not behave the same on
    the 80387 if they depend on this bit.
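The fields of Table 6-1 can be extracted from the control and status words
as sketched below. The helper names are hypothetical. The bit positions
assumed are: exception masks and flags in bits 0-5, exception summary in bit
7, precision control in bits 8-9, rounding control in bits 10-11, stack top
in bits 11-13 of the status word, and B in bit 15. The last two functions
apply the three differences listed above for hardware RESET.

```c
#include <stdint.h>

unsigned cw_exception_masks(uint16_t cw)   { return cw & 0x3Fu; }
unsigned cw_precision_control(uint16_t cw) { return (cw >> 8) & 0x3u; }
unsigned cw_rounding_control(uint16_t cw)  { return (cw >> 10) & 0x3u; }

unsigned sw_exception_flags(uint16_t sw)   { return sw & 0x3Fu; }
unsigned sw_exception_summary(uint16_t sw) { return (sw >> 7) & 0x1u; }
unsigned sw_stack_top(uint16_t sw)         { return (sw >> 11) & 0x7u; }

/* Control word after RESET: as after FNINIT, but with the
 * invalid-operation mask (bit 0) reset. */
uint16_t cw_after_reset(uint16_t cw_after_fninit)
{
    return (uint16_t)(cw_after_fninit & 0xFFFEu);
}

/* Status word after RESET: as after FNINIT, but with the
 * invalid-operation flag, exception summary, and B bits set. */
uint16_t sw_after_reset(uint16_t sw_after_fninit)
{
    return (uint16_t)(sw_after_fninit | 0x0001u | 0x0080u | 0x8000u);
}
```

As a usage example, a control word of 037FH (the table values with reserved
bit 6 assumed to read as one) decodes to masks 111111, precision control 11,
and rounding control 00.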
5961 6.2.6 80387 Emulation
5963 If it is determined that no 80387 NPX is available in the system, systems
5964 software may decide to emulate ESC instructions in software. This emulation
5965 is easily supported by the 80386 hardware, because the 80386 can be
5966 configured to trap to a software emulation routine whenever it encounters an
5967 ESC instruction in its instruction stream.
5969 Whenever the 80386 CPU encounters an ESC instruction, and its MP and EM
5970 status bits are set appropriately (MP=0, EM=1), the 80386 automatically
5971 traps to interrupt #7, the "processor extension not available" exception.
5972 The return link stored on the stack points to the first byte of the ESC
5973 instruction, including the prefix byte(s), if any. The exception handler can
5974 use this return link to examine the ESC instruction and proceed to emulate
5975 the numeric instruction in software.
5977 The emulator must step the return pointer so that, upon return from the
5978 exception handler, execution can resume at the first instruction following
5979 the ESC instruction.
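Stepping the return pointer requires the emulator to know how long the ESC
instruction is. The C fragment below is a minimal sketch of that
calculation, not the method used by any particular emulator; the function
names are hypothetical. It assumes the standard 80386 encodings: prefix
bytes, an ESC opcode in the range D8H-DFH, a ModR/M byte, an optional SIB
byte in 32-bit addressing, and an optional displacement. ESC instructions
carry no immediate data.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Segment-override, size-override, LOCK and REP prefix bytes. */
bool is_prefix_byte(uint8_t b)
{
    switch (b) {
    case 0x26: case 0x2E: case 0x36: case 0x3E: case 0x64: case 0x65:
    case 0x66: case 0x67:
    case 0xF0: case 0xF2: case 0xF3:
        return true;
    default:
        return false;
    }
}

/* ESC opcodes have the bit pattern 11011xxx. */
bool is_esc_opcode(uint8_t b)
{
    return (b & 0xF8u) == 0xD8u;
}

/* Length in bytes of the ESC instruction starting at code[0], prefixes
 * included, or 0 if the bytes are not a complete ESC instruction.
 * addr32_default is the address-size attribute of the code segment. */
size_t esc_instruction_length(const uint8_t *code, size_t len,
                              bool addr32_default)
{
    size_t i = 0;
    bool addr32 = addr32_default;

    while (i < len && is_prefix_byte(code[i])) {
        if (code[i] == 0x67)
            addr32 = !addr32_default;   /* address-size override */
        i++;
    }
    if (i + 2 > len || !is_esc_opcode(code[i]))
        return 0;

    uint8_t modrm = code[i + 1];
    unsigned mod = modrm >> 6;
    unsigned rm = modrm & 7u;
    size_t n = i + 2;

    if (mod == 3)
        return n;                       /* register form, no memory operand */

    if (addr32) {
        unsigned base = rm;
        if (rm == 4) {                  /* SIB byte follows */
            if (n >= len)
                return 0;
            base = code[n] & 7u;
            n++;
        }
        if (mod == 0 && base == 5) n += 4;
        else if (mod == 1)         n += 1;
        else if (mod == 2)         n += 4;
    } else {
        if (mod == 0 && rm == 6)   n += 2;
        else if (mod == 1)         n += 1;
        else if (mod == 2)         n += 2;
    }
    return n <= len ? n : 0;
}
```

An emulator built along these lines would add the returned length to the
saved return offset before executing IRET.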
5981 To an application program, execution on an 80386 system with 80387
5982 emulation is almost indistinguishable from execution on a system with an
5983 80387, except for the difference in execution speeds.
There are several important considerations when using emulation on an 80386
system:
- When operating in protected mode, numeric applications using the
  emulator must be executed in execute-readable code segments. Numeric
  software cannot be emulated if it is executed in execute-only code
  segments. This is because the emulator must be able to examine the
  particular numeric instruction that caused the emulation trap.
- Only privileged tasks can place the 80386 in emulation mode. The
  instructions necessary to place the 80386 in emulation mode are
  privileged instructions, and are not typically accessible to an
  application program.
5999 An emulator package (EMUL387) that runs on 80386 systems is available from
6000 Intel. This emulation package operates in both real and protected mode as
6001 well as in virtual 8086 mode, providing a complete functional equivalent for
6002 the 80387 emulated in software.
When using the EMUL387 emulator, writers of numeric exception handlers
should be aware of one slight difference between the emulated 80387 and the
80387 hardware:
- On the 80387 hardware, exception handlers are invoked by the 80386 at
  the first WAIT or ESC instruction following the instruction causing the
  exception. The return link, stored on the 80386 stack, points to this
  second WAIT or ESC instruction, where execution will resume following a
  return from the exception handler.
- Using the EMUL387 emulator, numeric exception handlers are invoked
  from within the emulator itself. The return link stored on the stack
  when the exception handler is invoked will therefore point back to the
  EMUL387 emulator, rather than to the program code actually being
  executed (emulated). An IRET return from the exception handler returns
  to the emulator, which then returns immediately to the emulated
  program. This added layer of indirection should not cause confusion,
  however, because the instruction causing the exception can always be
  identified from the 80387's instruction and data pointers.
6025 6.2.7 Handling Numerics Exceptions
6027 Once the 80387 has been initialized and normal execution of applications
6028 has been commenced, the 80387 NPX may occasionally require attention in
6029 order to recover from numeric processing exceptions. This section provides
6030 details for writing software exception handlers for numeric exceptions.
6031 Numeric processing exceptions have already been introduced in Chapter 3.
The 80387 NPX can take one of two actions when it recognizes a numeric
exception:
- If the exception is masked, the NPX will automatically perform its own
  masked exception response, correcting the exception condition according
  to fixed rules, and then continuing with its instruction execution.
- If the exception is unmasked, the NPX signals the exception to the
  80386 CPU using the ERROR# status line between the two processors. Each
  time the 80386 encounters an ESC or WAIT instruction in its instruction
  stream, the CPU checks the condition of this ERROR# status line. If
  ERROR# is active, the CPU automatically traps to Interrupt vector #16,
  the Processor Extension Error trap.
6047 Interrupt vector #16 typically points to a software exception handler,
6048 which may or may not be a part of systems software. This exception handler
6049 takes the form of an 80386 interrupt procedure.
When handling numeric errors, the CPU has two responsibilities:
- The CPU must not disturb the numeric context when an error is
  detected.
- The CPU must clear the error and attempt recovery from the error.
6058 Although the manner in which programmers may treat these responsibilities
6059 varies from one implementation to the next, most exception handlers will
6060 include these basic steps:
- Store the NPX environment (control, status, and tag words, operand and
  instruction pointers) as it existed at the time of the exception.
- Clear the exception bits in the status word.
- Enable interrupts on the CPU.
- Identify the exception by examining the status and control words in
  the saved environment.
- Take some system-dependent action to rectify the exception.
- Return to the interrupted program and resume normal execution.
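The "identify" and "clear" steps above can be sketched in C, operating on
the status and control words taken from a saved environment image. The
names are hypothetical. The bit assignments assumed are the six exception
flags and masks in bits 0-5 (invalid, denormal, zero divide, overflow,
underflow, precision), stack fault in bit 6, exception summary in bit 7,
and B in bit 15.

```c
#include <stdint.h>

enum {
    EX_INVALID     = 0x01,
    EX_DENORMAL    = 0x02,
    EX_ZERO_DIVIDE = 0x04,
    EX_OVERFLOW    = 0x08,
    EX_UNDERFLOW   = 0x10,
    EX_PRECISION   = 0x20
};

#define SW_STACK_FAULT 0x0040u
#define SW_SUMMARY     0x0080u
#define SW_BUSY        0x8000u

/* Exceptions that are flagged in the status word and NOT masked in the
 * control word; these are the ones the handler was invoked for. */
unsigned unmasked_exceptions(uint16_t status, uint16_t control)
{
    return (unsigned)(status & ~control) & 0x3Fu;
}

/* Status word image with the exception bits cleared, suitable for
 * reloading without immediately requesting another exception. */
uint16_t clear_exception_bits(uint16_t status)
{
    return (uint16_t)(status & ~(0x003Fu | SW_STACK_FAULT |
                                 SW_SUMMARY | SW_BUSY));
}
```

For example, with zero divide unmasked (control word 037BH) and a status
word of 0085H, unmasked_exceptions returns EX_ZERO_DIVIDE; the invalid flag
is set but masked, so it is not reported.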
6077 6.2.8 Simultaneous Exception Response
6079 In cases where multiple exceptions arise simultaneously, the 80387 signals
6080 one exception according to the precedence shown at the end of Chapter 3.
6081 This means, for example, that an SNaN divided by zero results in an invalid
6082 operation, not in a zero divide exception.
6085 6.2.9 Exception Recovery Examples
6087 Recovery routines for NPX exceptions can take a variety of forms. They can
6088 change the arithmetic and programming rules of the NPX. These changes may
6089 redefine the default fix-up for an error, change the appearance of the NPX
6090 to the programmer, or change how arithmetic is defined on the NPX.
6092 A change to an exception response might be to automatically normalize all
6093 denormals loaded from memory. A change in appearance might be extending the
6094 register stack into memory to provide an "infinite" number of numeric
6095 registers. The arithmetic of the NPX can be changed to automatically extend
6096 the precision and range of variables when exceeded. All these functions can
6097 be implemented on the NPX via numeric exceptions and associated recovery
6098 routines in a manner transparent to the application programmer.
Some other possible application-dependent actions might include:
- Incrementing an exception counter for later display or printing
- Printing or displaying diagnostic information (e.g., the 80387
  environment and registers)
- Aborting further execution
- Storing a diagnostic value (a NaN) in the result and continuing with
  the computation
Notice that an exception may or may not constitute an error, depending on
the application. Once the exception handler corrects the condition causing
the exception, the floating-point instruction that caused the exception can
be restarted, if appropriate. This cannot be accomplished using the IRET
instruction, however, because the trap occurs at the ESC or WAIT instruction
following the offending ESC instruction. The exception handler must obtain
(using FSAVE or FSTENV) the address of the offending instruction in the task
that initiated it, make a copy of it, execute the copy in the context of the
offending task, and then return via IRET to the current CPU instruction
stream.
6123 In order to correct the condition causing the numeric exception, exception
6124 handlers must recognize the precise state of the NPX at the time the
6125 exception handler was invoked, and be able to reconstruct the state of the
6126 NPX when the exception initially occurred. To reconstruct the state of the
6127 NPX, programmers must understand when, during the execution of an NPX
6128 instruction, exceptions are actually recognized.
Invalid operation, zero divide, and denormalized exceptions are detected
before an operation begins, whereas overflow, underflow, and precision
exceptions are not raised until a true result has been computed. When a
"before" exception is detected, the NPX register stack and memory have
not yet been updated, and appear as if the offending instruction has not
been executed.
When an "after" exception is detected, the register stack and memory appear
as if the instruction has run to completion; i.e., they may be updated.
(However, in a store or store-and-pop operation, unmasked over/underflow is
handled like a "before" exception; memory is not updated and the stack is
not popped.) The programming examples contained in Chapter 7 include an
outline of several exception handlers to process numeric exceptions for the
80387.
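A recovery routine can encode this rule in a small predicate. The C sketch
below is illustrative; the names are hypothetical, and the exception values
are assumed to be the individual flag bits of the status word. It applies
to unmasked exceptions only.

```c
#include <stdbool.h>

enum {
    EX_INVALID     = 0x01,
    EX_DENORMAL    = 0x02,
    EX_ZERO_DIVIDE = 0x04,
    EX_OVERFLOW    = 0x08,
    EX_UNDERFLOW   = 0x10,
    EX_PRECISION   = 0x20
};

/* Returns true if the register stack and memory still appear as if the
 * offending instruction has not been executed ("before" behavior).
 * is_store is true for store and store-and-pop operations. */
bool operands_unchanged(unsigned exception, bool is_store)
{
    switch (exception) {
    case EX_INVALID:
    case EX_ZERO_DIVIDE:
    case EX_DENORMAL:
        return true;            /* detected before the operation begins  */
    case EX_OVERFLOW:
    case EX_UNDERFLOW:
        return is_store;        /* stores are handled like "before"      */
    default:
        return false;           /* precision: result already delivered   */
    }
}
```

A handler that intends to re-execute the offending instruction can use such
a predicate to decide whether the original operands are still available.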
6145 Chapter 7 Numeric Programming Examples
----------------------------------------------------------------------------
The following sections contain examples of numeric programs for the 80387
NPX written in ASM386. These examples are intended to illustrate some of the
techniques for programming the 80386/80387 computing system for numeric
applications.
6155 7.1 Conditional Branching Example
As discussed in Chapter 2, several numeric instructions post their results
to the condition code bits of the 80387 status word. Although there are many
ways to implement conditional branching following a comparison, the basic
approach is as follows:
- Execute the comparison.
- Store the status word. (The 80387 allows storing the status word
  directly into the AX register.)
- Inspect the condition code bits.
- Jump on the result.
6171 Figure 7-1 is a code fragment that illustrates how two memory-resident
6172 double-format real numbers might be compared (similar code could be used
6173 with the FTST instruction). The numbers are called A and B, and the
6174 comparison is A to B.
6176 The comparison itself requires loading A onto the top of the 80387 register
6177 stack and then comparing it to B, while popping the stack with the same
6178 instruction. The status word is then written into the 80386 AX register.
6180 A and B have four possible orderings, and bits C3, C2, and C0 of the
6181 condition code indicate which ordering holds. These bits are positioned in
6182 the upper byte of the NPX status word so as to correspond to the CPU's zero,
6183 parity, and carry flags (ZF, PF, and CF), when the byte is written into the
6184 flags. The code fragment sets ZF, PF, and CF of the CPU status word to the
6185 values of C3, C2, and C0 of the NPX status word, and then uses the CPU
6186 conditional jump instructions to test the flags. The resulting code is
6187 extremely compact, requiring only seven instructions.
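The mapping from condition codes to CPU flags can be restated in C. The
sketch below is illustrative and uses hypothetical names; it assumes that
C0, C2, and C3 occupy bits 8, 10, and 14 of the status word, so that in the
upper byte they fall on the CF, PF, and ZF positions (bits 0, 2, and 6). The
tests are made in the same order as the conditional jumps described above.

```c
#include <stdint.h>

typedef enum { A_GREATER, A_LESS, A_EQUAL, A_B_UNORDERED } Ordering;

/* Decode the result of comparing A to B from the stored status word. */
Ordering ordering_from_status(uint16_t status_word)
{
    uint8_t ah = (uint8_t)(status_word >> 8);  /* byte that SAHF would load */
    int cf = ah & 0x01;                        /* C0 lands on CF            */
    int pf = (ah >> 2) & 0x01;                 /* C2 lands on PF            */
    int zf = (ah >> 6) & 0x01;                 /* C3 lands on ZF            */

    if (pf) return A_B_UNORDERED;              /* JP  */
    if (cf) return A_LESS;                     /* JB  */
    if (zf) return A_EQUAL;                    /* JE  */
    return A_GREATER;                          /* fall through */
}
```

Bits of the status word other than C3, C2, and C0 (the stack-top field, for
example) do not affect the result.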
6189 The FXAM instruction updates all four condition code bits. Figure 7-2 shows
6190 how a jump table can be used to determine the characteristics of the value
6191 examined. The jump table (FXAM_TBL) is initialized to contain the 32-bit
6192 displacement of 16 labels, one for each possible condition code setting.
6193 Note that four of the table entries contain the same value, "EMPTY." The
6194 first two condition code settings correspond to "EMPTY." The two other table
6195 entries that contain "EMPTY" will never be used on the 80387, but may be
6196 used if the code is executed with an 80287.
The program fragment performs the FXAM and stores the status word. It then
manipulates the condition code bits to finally produce a number in register
EAX that equals the condition code times 4. This involves zeroing the unused
bits in the word that contains the code, shifting the code to the right so
that C2-C0 are in position, and then moving C3 so that it is adjacent to C2.
The resulting value is used as an index that selects one of the
displacements from FXAM_TBL (the multiplication of the condition code is
required because of the 4-byte length of each value in FXAM_TBL). The
unconditional JMP instruction effectively vectors through the jump table to
the labeled routine that contains code (not shown in the example) to process
each possible result of the FXAM instruction.
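The index calculation can be checked with a C restatement of the AND, SHR,
SAL, and OR steps of Figure 7-2. This is a sketch with hypothetical function
names; the class names are those of the FXAM_TBL entries. It assumes that
C0, C1, C2, and C3 are bits 8, 9, 10, and 14 of the status word, and it
yields the condition code multiplied by 4, the byte offset of a 32-bit table
entry.

```c
#include <stdint.h>

/* Byte offset into a table of sixteen 32-bit entries. */
unsigned fxam_table_offset(uint16_t status_word)
{
    uint32_t eax = status_word & 0x4700u;   /* AND AX,0100011100000000B    */
    eax >>= 6;                              /* SHR EAX,6: C2-C0 to bits 4-2 */
    uint8_t al = (uint8_t)(eax & 0xFFu);
    uint8_t ah = (uint8_t)((eax >> 8) & 0xFFu);
    ah = (uint8_t)(ah << 5);                /* SAL AH,5: C3 to bit 5       */
    al |= ah;                               /* OR AL,AH                    */
    return al;                              /* XOR AH,AH leaves only AL    */
}

/* Name of the table entry selected for a given status word. */
const char *fxam_class(uint16_t status_word)
{
    static const char *const names[16] = {
        "POS_UNNORM", "POS_NAN",      "NEG_UNNORM", "NEG_NAN",
        "POS_NORM",   "POS_INFINITY", "NEG_NORM",   "NEG_INFINITY",
        "POS_ZERO",   "EMPTY",        "NEG_ZERO",   "EMPTY",
        "POS_DENORM", "EMPTY",        "NEG_DENORM", "EMPTY"
    };
    return names[fxam_table_offset(status_word) / 4];
}
```

For example, a status word with only C3 set (4000H) gives offset 32, which
selects the ninth entry, POS_ZERO.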
Figure 7-1. Conditional Branching for Compares

        FLD     A               ; LOAD A ONTO TOP OF 387 STACK
        FCOMP   B               ; COMPARE A:B, POP A
        FSTSW   AX              ; STORE RESULT TO CPU AX REGISTER

; CPU AX REGISTER CONTAINS CONDITION CODES
; (RESULTS OF COMPARE)
; LOAD CONDITION CODES INTO CPU FLAGS

        SAHF

; USE CONDITIONAL JUMPS TO DETERMINE ORDERING OF A TO B

        JP      A_B_UNORDERED   ; TEST C2 (PF)
        JB      A_LESS          ; TEST C0 (CF)
        JE      A_EQUAL         ; TEST C3 (ZF)
A_GREATER:                      ; C0 (CF) = 0, C3 (ZF) = 0
        .
        .
A_EQUAL:                        ; C0 (CF) = 0, C3 (ZF) = 1
        .
        .
A_LESS:                         ; C0 (CF) = 1, C3 (ZF) = 0
        .
        .
A_B_UNORDERED:                  ; C2 (PF) = 1
        .
        .
Figure 7-2. Conditional Branching for FXAM

; JUMP TABLE FOR EXAMINE ROUTINE

FXAM_TBL  DD  POS_UNNORM, POS_NAN, NEG_UNNORM, NEG_NAN,
&             POS_NORM, POS_INFINITY, NEG_NORM,
&             NEG_INFINITY, POS_ZERO, EMPTY, NEG_ZERO,
&             EMPTY, POS_DENORM, EMPTY, NEG_DENORM, EMPTY

; EXAMINE ST AND STORE RESULT (CONDITION CODES)

        FXAM
        XOR     EAX,EAX                 ; CLEAR EAX
        FSTSW   AX

; CALCULATE OFFSET INTO JUMP TABLE

        AND     AX,0100011100000000B    ; CLEAR ALL BITS EXCEPT C3, C2-C0
        SHR     EAX,6                   ; SHIFT C2-C0 INTO PLACE (000XXX00)
        SAL     AH,5                    ; POSITION C3 (00X00000)
        OR      AL,AH                   ; DROP C3 IN ADJACENT TO C2 (00XXXX00)
        XOR     AH,AH                   ; CLEAR OUT THE OLD COPY OF C3

; JUMP TO THE ROUTINE `ADDRESSED' BY CONDITION CODE

        JMP     FXAM_TBL[EAX]

; HERE ARE THE JUMP TARGETS, ONE TO HANDLE
; EACH POSSIBLE RESULT OF FXAM
6308 7.2 Exception Handling Examples
6310 There are many approaches to writing exception handlers. One useful
6311 technique is to consider the exception handler procedure as consisting of
6312 "prologue," "body," and "epilogue" sections of code. This procedure is
6313 invoked via interrupt number 16.
At the beginning of the prologue, CPU interrupts have been disabled. The
prologue performs all functions that must be protected from possible
interruption by higher-priority sources. Typically, this involves saving CPU
registers and transferring diagnostic information from the 80387 to memory.
When the critical processing has been completed, the prologue may enable CPU
interrupts to allow higher-priority interrupt handlers to preempt the
exception handler.
6323 The body of the exception handler examines the diagnostic information and
6324 makes a response that is necessarily application-dependent. This response
6325 may range from halting execution, to displaying a message, to attempting to
6326 repair the problem and proceed with normal execution.
6328 The epilogue essentially reverses the actions of the prologue, restoring
6329 the CPU and the NPX so that normal execution can be resumed. The epilogue
6330 must not load an unmasked exception flag into the 80387 or another exception
6331 will be requested immediately.
Figures 7-3 through 7-5 show the ASM386 coding of three skeleton
exception handlers. They show how prologues and epilogues can be written for
various situations, but provide comments indicating only where the
application-dependent exception handling body should be placed.
6338 Figures 7-3 and 7-4 are very similar; their only substantial difference is
6339 their choice of instructions to save and restore the 80387. The tradeoff
6340 here is between the increased diagnostic information provided by FNSAVE and
6341 the faster execution of FNSTENV. For applications that are sensitive to
6342 interrupt latency or that do not need to examine register contents, FNSTENV
6343 reduces the duration of the "critical region," during which the CPU does not
6344 recognize another interrupt request.
After the exception handler body, the epilogues prepare the CPU and the NPX
to resume execution from the point of interruption (i.e., the instruction
following the one that generated the unmasked exception). Notice that the
exception flags in the memory image that is loaded into the 80387 are
cleared to zero prior to reloading (in fact, in these examples, the entire
low-order byte of the status word image is cleared).
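The displacements used for the status word in these examples ([EBP-104] for
a state image at [EBP-108], [EBP-24] for an environment image at [EBP-28])
follow from the layout of the saved image. The C sketch below assumes the
32-bit protected-mode format, in which each of the first three words
occupies the low half of a 32-bit slot; the structure and function names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed 32-bit protected-mode environment image (28 bytes). */
typedef struct {
    uint16_t control_word;      uint16_t reserved0;
    uint16_t status_word;       uint16_t reserved1;
    uint16_t tag_word;          uint16_t reserved2;
    uint32_t instruction_offset;
    uint16_t code_selector;     uint16_t opcode;
    uint32_t operand_offset;
    uint16_t operand_selector;  uint16_t reserved3;
} NpxEnv32;

/* EBP-relative displacement of the status word's low byte, given the
 * EBP-relative displacement of the start of the image. */
long status_byte_displacement(long image_displacement)
{
    return image_displacement + (long)offsetof(NpxEnv32, status_word);
}

/* C equivalent of MOV BYTE PTR [image+4], 0H: clears the exception flags,
 * stack fault, and exception summary bits, leaving the upper byte alone. */
void clear_status_low_byte(uint8_t *image)
{
    image[offsetof(NpxEnv32, status_word)] = 0;
}
```

Because only the low-order byte is written, the condition code and
stack-top fields in the upper byte of the status word image are preserved.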
6353 The examples in Figures 7-3 and 7-4 assume that the exception handler
6354 itself will not cause an unmasked exception. Where this is a possibility,
6355 the general approach shown in Figure 7-5 can be employed. The basic
6356 technique is to save the full 80387 state and then to load a new control
6357 word in the prologue. Note that considerable care should be taken when
6358 designing an exception handler of this type to prevent the handler from
6359 being reentered endlessly.
6362 Figure 7-3. Full-State Exception Handler
6366 ; SAVE CPU REGISTERS, ALLOCATE STACK SPACE
6367 ; FOR 80387 STATE IMAGE
6371 ; SAVE FULL 80387 STATE, ENABLE CPU INTERRUPTS
6375 ; APPLICATION-DEPENDENT EXCEPTION HANDLING
6378 ; CLEAR EXCEPTION FLAGS IN STATUS WORD
6379 ; (WHICH IS IN MEMORY)
6380 ; RESTORE MODIFIED STATE IMAGE
6381 MOV BYTE PTR [EBP-104], 0H
6383 ; DEALLOCATE STACK SPACE, RESTORE CPU REGISTERS
6389 ; RETURN TO INTERRUPTED CALCULATION
6394 Figure 7-4. Reduced-Latency Exception Handler
6396 SAVE_ENVIRONMENT PROC
6398 ; SAVE CPU REGISTERS, ALLOCATE STACK SPACE
6399 ; FOR 80387 ENVIRONMENT
6404 ; SAVE ENVIRONMENT, ENABLE CPU INTERRUPTS
6408 ; APPLICATION EXCEPTION-HANDLING CODE GOES HERE
6410 ; CLEAR EXCEPTION FLAGS IN STATUS WORD
6411 ; (WHICH IS IN MEMORY)
6412 ; RESTORE MODIFIED ENVIRONMENT IMAGE
6413 MOV BYTE PTR [EBP-24], 0H
6415 ; DE-ALLOCATE STACK SPACE, RESTORE CPU REGISTERS
6419 ; RETURN TO INTERRUPTED CALCULATION
6421 SAVE_ENVIRONMENT ENDP
6424 Figure 7-5. Reentrant Exception Handler
6429 LOCAL_CONTROL DW ? ; ASSUME INITIALIZED
6435 ; SAVE CPU REGISTERS, ALLOCATE STACK SPACE FOR
6443 ; SAVE STATE, LOAD NEW CONTROL WORD,
6444 ; ENABLE CPU INTERRUPTS
6451 ; APPLICATION EXCEPTION HANDLING CODE GOES HERE.
6452 ; AN UNMASKED EXCEPTION GENERATED HERE WILL
6453 ; CAUSE THE EXCEPTION HANDLER TO BE REENTERED.
6454 ; IF LOCAL STORAGE IS NEEDED, IT MUST BE
6455 ; ALLOCATED ON THE CPU STACK.
6459 ; CLEAR EXCEPTION FLAGS IN STATUS WORD
6460 ; (WHICH IS IN MEMORY)
6461 ; RESTORE MODIFIED STATE IMAGE
6462 MOV BYTE PTR [EBP-104], 0H
6464 ; DE-ALLOCATE STACK SPACE, RESTORE CPU REGISTERS
6470 ; RETURN TO POINT OF INTERRUPTION
6475 7.3 Floating-Point to ASCII Conversion Examples
6477 Numeric programs must typically format their results at some point for
6478 presentation and inspection by the program user. In many cases, numeric
6479 results are formatted as ASCII strings for printing or display. This example
6480 shows how floating-point values can be converted to decimal ASCII character
6481 strings. The function shown in Figure 7-6 can be invoked from PL/M-386,
6482 Pascal-386, FORTRAN-386, or ASM386 routines.
6484 Shortness, speed, and accuracy were chosen rather than providing the
6485 maximum number of significant digits possible. An attempt is made to keep
6486 integers in their own domain to avoid unnecessary conversion errors.
6488 Using the extended precision real number format, this routine achieves a
6489 worst case accuracy of three units in the 16th decimal position for a
6490 noninteger value or integers greater than 10^(18). This is double precision
6491 accuracy. With values having decimal exponents less than 100 in magnitude,
6492 the accuracy is one unit in the 17th decimal position.
6494 Higher precision can be achieved with greater care in programming, larger
6495 program size, and lower performance.
6498 Figure 7-6. Floating-Point to ASCII Conversion Routine
6500 XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE FLOATING_TO_ASCII
6501 OBJECT MODULE PLACED IN fpasc.obj
6502 ASSEMBLER INVOKED BY: asm386 fpasc.asm
6506 1 +1 $title(`Convert a floating point number to ASCII')
6508 3 name floating_to_ascii
6510 00000000 5 public floating_to_ascii
6511 6 extrn get_power_10:near,tos_status:near
6513 8 ; This subroutine will convert the floating point
6514 9 ; number in the top of the NPX stack to an ASCII
6515 10 ; string and separate power of 10 scaling value
6516 11 ; (in binary). The maximum width of the ASCII string
6517 12 ; formed is controlled by a parameter which must be
6518 13 ; > 1. Unnormal values, denormal values, and pseudo
6519 14 ; zeroes will be correctly converted. However, unnormals
6520 15 ; and pseudo zeros are no longer supported formats on the
6521 16 ; 80387 (in conformance with the IEEE floating point
6522 17 ; standard) and hence not generated internally. A
6523 18 ; returned value will indicate how many binary bits
6524 19 ; of precision were lost in an unnormal or denormal
6525 20 ; value. The magnitude (in terms of binary power)
6526 21 ; of a pseudo zero will also be indicated. Integers
6527 22 ; less than 10**18 in magnitude are accurately converted
6528 23 ; if the destination ASCII string field is wide enough
6529 24 ; to hold all the digits. Otherwise the value is converted
6530 25 ; to scientific notation.
6532 27 ; The status of the conversion is identified by the
6533 28 ; return value, it can be:
6535 30 ; 0 conversion complete, string_size is defined
6536 31 ; 1 invalid arguments
6537 32 ; 2 exact integer conversion, string_size is defined
6539 34 ; 4 + NAN (Not A Number)
6543 38 ; 8 pseudo zero found, string_size is defined
6545 40 ; The PL/M-386 calling convention is:
6547 42 ; floating_to_ascii:
6548 43 ; procedure (number,denormal_ptr,string_ptr,size_ptr,
6549 44 ; field_size, power_ptr) word external;
6550 45 ; declare (denormal_ptr,string_ptr,power_ptr,size_ptr)
6552 47 ; declare field_size word,
6553 48 ; string_size based size_ptr word;
6554 49 ; declare number real;
6555 50 ; declare denormal integer based denormal_ptr;
6556 51 ; declare power integer based power_ptr;
6557 52 ; end floating_to_ascii;
6559 54 ; The floating point value is expected to be
6560 55 ; on the top of the NPX stack. This subroutine
6561 56 ; expects 3 free entries on the NPX stack and
6562 57 ; will pop the passed value off when done. The
6563 58 ; generated ASCII string will have a leading
6564 59 ; character either `-' or `+' indicating the sign
6565 60 ; of the value. The ASCII decimal digits will
6566 61 ; immediately follow. The numeric value of the
6567 62 ; ASCII string is (ASCII STRING.)*10**POWER. If
6568 63 ; the given number was zero, the ASCII string will
6569 64 ; contain a sign and a single zero character. The
6570 65 ; value string_size indicates the total length of
6571 66 ; the ASCII string including the sign character.
6572 67 ; String(0) will always hold the sign. It is
6573 68 ; possible for string_size to be less than
6574 69 ; field_size. This occurs for zeroes or integer
6575 70 ; values. A pseudo zero will return a special
6576 71 ; return code. The denormal count will indicate
6577 72 ; the power of two originally associated with the
6578 73 ; value. The power of ten and ASCII string will
6579 74 ; be as if the value was an ordinary zero.
6581 76 ; This subroutine is accurate up to a maximum of
6582 77 ; 18 decimal digits for integers. Integer values
6583 78 ; will have a decimal power of zero associated
6584 79 ; with them. For non integers, the result will be
6585 80 ; accurate to within 2 decimal digits of the 16th
6586 81 ; decimal place (double precision). The exponentiate
6587 82 ; instruction is also used for scaling the value into
6588 83 ; the range acceptable for the BCD data type. The
6589 84 ; rounding mode in effect on entry to the
6590 85 ; subroutine is used for the conversion.
6592 87 ; The following registers are not transparent:
6594 89 ; eax ebx ecx edx esi edi eflags
6597 92 ; Define the stack layout.
6599 00000000[] 94 ebp_save equ dword ptr [ebp]
6600 00000004[] 95 es_save equ ebp_save + size ebp_save
6601 00000008[] 96 return_ptr equ es_save + size es_save
6602 0000000C[] 97 power_ptr equ return_ptr + size return_ptr
6603 00000010[] 98 field_size equ power_ptr + size power_ptr
6604 00000014[] 99 size_ptr equ field_size + size field_size
6605 00000018[] 100 string_ptr equ size_ptr + size size_ptr
6606 0000001C[] 101 denormal_ptr equ string_ptr + size string_ptr
6608 0014 103 parms_size equ size power_ptr + size field_size +
6609 104 & size size_ptr + size string_ptr +
6610 105 & size denormal_ptr
6612 107 ; Define constants used
6614 109 BCD_DIGITS equ 18 ; Number of digits in bcd_value
6617 112 MINUS equ 1 ; Define return values
6618 113 NAN equ 4 ; The exact values chosen
6619 114 INFINITY equ 6 ; here are important. They must
6620 115 INDEFINITE equ 3 ; correspond to the possible return
6621 116 PSEUDO_ZERO equ 8 ; values and be in the same numeric
6622 117 INVALID equ -2 ; order as tested by the program.
6629 124 ; Define layout of temporary storage area.
6631 126 power_two equ word ptr [ebp - WORD_SIZE]
6632 127 bcd_value equ tbyte ptr power_two - BCD_SIZE
6633 128 bcd_byte equ byte ptr bcd_value
6634 129 fraction equ bcd_value
6636 131 local_size equ size power_two + size bcd_value
6638 133 ; Allocate stack space for the temporaries so
6639 134 ; the stack will be big enough
6641 136 stack stackseg (local_size+6) ; Allocate stack
6642 137 ; space for locals
6644 139 code segment public er
6645 140 extrn power_table:qword
6647 142 ; Constants used by this function.
6649 144 even ; Optimize for 16 bits
6650 00000000 0A00 145 const10 dw 10 ; Adjustment value for
6653 148 ; Convert the C3,C2,C1,C0 encoding from tos_status
6654 149 ; into meaningful bit flags and values.
6656 00000002 F8 151 status_table db UNNORMAL, NAN, UNNORMAL + MINUS,
6657 00000003 04 152 & NAN + MINUS, NORMAL, INFINITY,
6658 00000004 F9 153 & NORMAL + MINUS, INFINITY + MINUS,
6659 00000005 05 154 & ZERO, INVALID, ZERO + MINUS, INVALID,
6660 00000006 00 155 & DENORMAL, INVALID, DENORMAL + MINUS, INVALID
6673 00000012 157 floating_to_ascii proc
6675 00000012 E800000000 E 159 call tos_status ; Look at status of ST(0)
6677 161 ; Get descriptor from table
6678 00000017 2E0FB68002000000 R 162 movzx eax, status_table[eax]
6679 0000001F 3CFE 163 cmp al,INVALID ; Look for empty ST(0)
6680 00000021 7527 164 jne not_empty
6682 166 ; ST(0) is empty! Return the status value.
6684 00000023 C21400 168 ret parms_size
6686 170 ; Remove infinity from stack and exit.
6688 00000026 172 found_infinity:
6689 00000026 DDD8 173 fstp st(0) ; OK to leave fstp running
6690 00000028 EB02 174 jmp short exit_proc
6692 176 ; String space is too small!
6693 177 ; Return invalid code.
6695 0000002A 179 small_string:
6696 0000002A B0FE 180 mov al,INVALID
6697 0000002C 181 exit_proc:
6698 0000002C C9 182 leave ; Restore stack setup
6699 0000002D 07 183 pop es
6700 0000002E C21400 184 ret parms_size
6702 186 ; ST(0) is NAN or indefinite. Store the
6703 187 ; value in memory and look at the fraction
6704 188 ; field to separate indefinite from an ordinary NAN.
6706 00000031 190 NAN_or_indefinite:
6707 00000031 DB7DF2 191 fstp fraction ; Remove value from stack
6708 192 ; for examination
6709 00000034 A801 193 test al,MINUS ; Look at sign bit
6710 00000036 9B 194 fwait ; Insure store is done
6711 00000037 74F3 195 jz exit_proc ; Can't be indefinite if
6714 00000039 BB000000C0 198 mov ebx,0C0000000H ; Match against upper 32
6715 199 ;bits of fraction
6717 201 ; Compare bits 63-32
6718 0000003E 2B5DF6 202 sub ebx, dword ptr fraction + 4
6720 204 ; Bits 31-0 must be zero
6721 00000041 0B5DF2 205 or ebx, dword ptr fraction
6722 00000044 75E6 206 jnz exit_proc
6724 208 ; Set return value for indefinite value
6725 00000046 B003 209 mov al,INDEFINITE
6726 00000048 EBE2 210 jmp exit_proc
6728 212 ; Allocate stack space for local variables
6729 213 ; and establish parameter addressibility.
6731 0000004A 215 not_empty:
6732 0000004A 06 216 push es ; Save working register
6733 0000004B C80C0000 217 enter local_size, 0 ; Setup stack addressing
6736 220 ; Check for enough string space
6737 0000004F 8B4D10 221 mov ecx,field_size
6738 00000052 83F902 222 cmp ecx,2
6739 00000055 7CD3 223 jl small_string
6741 00000057 49 225 dec ecx ; Adjust for sign character
6743 227 ; See if string is too large for BCD
6744 00000058 83F912 228 cmp ecx,BCD_DIGITS
6745 0000005B 7605 229 jbe size_ok
6747 231 ; Else set maximum string size
6748 0000005D B912000000 232 mov ecx,BCD_DIGITS
6749 00000062 233 size_ok:
6750 00000062 3C06 234 cmp al,INFINITY ; Look for infinity
6752 236 ; Return status value for + or - inf
6753 00000064 7DC0 237 jge found_infinity
6755 00000066 3C04 239 cmp al,NAN ; Look for NAN or INDEFINITE
6756 00000068 7DC7 240 jge NAN_or_indefinite
6758 242 ; Set default return values and check that
6759 243 ; the number is normalized.
6761 0000006A D9E1 245 fabs ; Use positive value only
6762 246 ; sign bit in al has true sign of value
6763 0000006C 31D2 247 xor edx,edx ; Form 0 constant
6764 0000006E 8B7D1C 248 mov edi,denormal_ptr; Zero denormal count
6765 00000071 668917 249 mov [edi], dx
6766 00000074 8B5D0C 250 mov ebx,power_ptr ; Zero power of ten value
6767 00000077 668913 251 mov [ebx], dx
6768 0000007A 88C2 252 mov dl, al
6769 0000007C 80E201 253 and dl, 1
6770 0000007F 80C202 254 add dl, EXACT
6771 00000082 3CFC 255 cmp al,ZERO ; Test for zero
6772 00000084 0F83BC000000 256 jae convert_integer ; Skip power code if value
6774 0000008A DB7DF2 258 fstp fraction
6775 0000008D 9B 259 fwait
6776 0000008E 8A45F9 260 mov al, bcd_byte + 7
6777 00000091 804DF980 261 or byte ptr bcd_byte + 7, 80h
6778 00000095 DB6DF2 262 fld fraction
6779 00000098 D9F4 263 fxtract
6780 0000009A A880 264 test al, 80h
6781 0000009C 7524 265 jnz normal_value
6783 0000009E D9E8 267 fld1
6784 000000A0 DEE9 268 fsub
6785 000000A2 D9E4 269 ftst
6786 000000A4 9BDFE0 270 fstsw ax
6787 000000A7 9E 271 sahf
6788 000000A8 7510 272 jnz set_unnormal_count
6790 274 ; Found a pseudo zero
6792 000000AA D9EC 276 fldlg2 ; Develop power of ten estimate
6793 000000AC 80C206 277 add dl, PSEUDO_ZERO - EXACT
6794 000000AF DECA 278 fmulp st(2), st
6795 000000B1 D9C9 279 fxch ; Get power of ten
6796 000000B3 DF1B 280 fistp word ptr [ebx] ; Set power of ten
6797 000000B5 E98C000000 281 jmp convert_integer
6799 000000BA 283 set_unnormal_count:
6800 000000BA D9F4 284 fxtract ; Get original fraction,
6801 285 ; now normalized
6802 000000BC D9C9 286 fxch ; Get unnormal count
6803 000000BE D9E0 287 fchs
6804 000000C0 DF1F 288 fistp word ptr [edi] ; Set unnormal count
6807 291 ; Calculate the decimal magnitude associated
6808 292 ; with this number to within one order. This
6809 293 ; error will always be inevitable due to
6810 294 ; rounding and lost precision. As a result,
6811 295 ; we will deliberately fail to consider the
6812 296 ; LOG10 of the fraction value in calculating
6813 297 ; the order. Since the fraction will always
6814 298 ; be 1 <= F < 2, its LOG10 will not change
6815 299 ; the basic accuracy of the function. To
6816 300 ; get the decimal order of magnitude, simply
6817 301 ; multiply the power of two by LOG10(2) and
6818 302 ; truncate the result to an integer.
6821 305 fstp fraction ; Save the fraction field
6823 307 fist power_two ; Save power of two
6824 308 fldlg2 ; Get LOG10(2)
6825 309 ; Power_two is now safe to use
6826 310 fmul ; Form LOG10(of exponent of number)
6827 311 fistp word ptr [ebx] ; Any rounding mode
6828 312 ; will work here
6830 314 ; Check if the magnitude of the number rules
6831 315 ; out treating it as an integer.
6833 317 ; CX has the maximum number of decimal digits
6836 320 fwait ; Wait for power_ten to be valid
6838 322 ; Get power of ten of value
6839 323 movsx esi, word ptr [ebx]
6840 324 sub esi,ecx ; Form scaling factor
6841 325 ; necessary in ax
6842 326 ja adjust_result ; Jump if number will not fit
6844 328 ; The number is between 1 and 10**(field size).
6845 329 ; Test if it is an integer.
6847 331 fild power_two ; Restore original number
6848 332 sub dl,NORMAL-EXACT ; Convert to exact return
6851 335 fscale ; Form full value, this
6853 337 fst st(1) ; Copy value for compare
6854 338 frndint ; Test if its an integer
6855 339 fcomp ; Compare values
6856 340 fstsw ax ; Save status
6857 341 sahf ; C3=1 implies it was
6859 343 jz convert_integer
6861 345 fstp st(0) ; Remove non integer value
6862 346 add dl,NORMAL-EXACT ; Restore original return value
6864 348 ; Scale the number to within the range allowed
6865 349 ; by the BCD format.The scaling operation should
6866 350 ; produce a number within one decimal order of
6867 351 ; magnitude of the largest decimal number
6868 352 ; representable within the given string width.
6870 354 ; The scaling power of ten value is in si.
6872 000000F2 356 adjust_result:
6873 000000F2 8BC6 357 mov eax,esi ; Setup for pow10
6874 000000F4 668903 358 mov word ptr [ebx],ax ; Set initial power
6875 359 ; of ten return value
6876 000000F7 F7D8 360 neg eax ; Subtract one for each order of
6877 361 ; magnitude the value is scaled by
6878 000000F9 E800000000 E 362 call get_power_10 ; Scaling factor is
6880 364 ; exponent and fraction
6881 000000FE DB6DF2 365 fld fraction ; Get fraction
6882 00000101 DEC9 366 fmul ; Combine fractions
6883 00000103 8BF1 367 mov esi,ecx ; Form power of ten of
6885 00000105 C1E603 369 shl esi,3 ; BCD value to fit in
6887 00000108 DF45FC 371 fild power_two ; Combine powers of two
6888 0000010B DEC2 372 faddp st(2),st
6889 0000010D D9FD 373 fscale ; Form full value,
6890 374 ; exponent was safe
6891 0000010F DDD9 375 fstp st(1) ; Remove exponent
6893 377 ; Test the adjusted value against a table
6894 378 ; of exact powers of ten. The combined errors
6895 379 ; of the magnitude estimate and power function
6896 380 ; can result in a value one order of magnitude
6897 381 ; too small or too large to fit correctly in
6898 382 ; the BCD field. To handle this problem, pretest
6899 383 ; the adjusted value, if it is too small or
6900 384 ; large, then adjust it by ten and adjust the
6901 385 ; power of ten value.
6903 00000111 387 test_power:
6905 389 ; Compare against exact power entry. Use the next
6906 390 ; entry since cx has been decremented by one
6907 00000111 2EDC9608000000 E 391 fcom power_table[esi]+type power_table
6908 00000118 9BDFE0 392 fstsw ax ; No wait is necessary
6909 0000011B 9E 393 sahf ; If C3 = C0 = 0 then
6910 0000011C 720F 394 jb test_for_small ; too big
6912 0000011E 2EDE3500000000 R 396 fidiv const10 ; Else adjust value
6913 00000125 80E2FD 397 and dl,not EXACT ; Remove exact flag
6914 00000128 66FF03 398 inc word ptr [ebx] ; Adjust power of ten value
6915 0000012B EB17 399 jmp short in_range ; Convert the value to a BCD
6917 0000012D 401 test_for_small:
6918 0000012D 2EDC9600000000 E 402 fcom power_table[esi] ; Test relative size
6919 00000134 9BDFE0 403 fstsw ax ; No wait is necessary
6920 00000137 9E 404 sahf ; If C0 = 0 then
6921 405 ; st(0) >= lower bound
6922 00000138 720A 406 jc in_range ; Convert the value to a
6925 0000013A 2EDE0D00000000 R 409 fimul const10 ; Adjust value into range
6926 00000141 66FF0B 410 dec word ptr [ebx] ; Adjust power of ten value
6927 00000144 411 in_range:
6928 00000144 D9FC 412 frndint ; Form integer value
6930 414 ; Assert: 0 <= TOS <= 999,999,999,999,999,999
6931 415 ; The TOS number will be exactly representable
6932 416 ; in 18 digit BCD format.
6934 00000146 418 convert_integer:
6935 00000146 DF75F2 419 fbstp bcd_value ; Store as BCD format number
6937 421 ; while the store BCD runs, setup registers
6938 422 ; for the conversion to ASCII.
6940 00000149 BE08000000 424 mov esi,BCD_SIZE-2 ; Initial BCD index value
6941 0000014E 66B9040F 425 mov cx,0f04h ; Set shift count and mask
6942 00000152 BB01000000 426 mov ebx,1 ; Set initial size of ASCII
6943 427 ; field for sign
6944 00000157 8B7D18 428 mov edi,string_ptr ; Get address of start of
6946 0000015A 8CD8 430 mov ax,ds ; Copy ds to es
6947 0000015C 8EC0 431 mov es,ax
6948 0000015E FC 432 cld ; Set autoincrement mode
6949 0000015F B02B 433 mov al,'+' ; Clear sign field
6950 00000161 F6C201 434 test dl,MINUS ; Look for negative value
6951 00000164 7402 435 jz positive_result
6953 00000166 B02D 437 mov al,`-'
6954 00000168 438 positive_result:
6955 00000168 AA 439 stosb ; Bump string pointer
6957 00000169 80E2FE 441 and dl,not MINUS ; Turn off sign bit
6958 0000016C 9B 442 fwait ; Wait for fbstp to finish
6960 444 ; Register usage:
6961 445 ; ah: BCD byte value in use
6962 446 ; al: ASCII character value
6963 447 ; dx: Return value
6964 448 ; ch: BCD mask = 0fh
6965 449 ; cl: BCD shift count = 4
6966 450 ; bx: ASCII string field width
6967 451 ; esi: BCD field index
6968 452 ; di: ASCII string field pointer
6969 453 ; ds,es: ASCII string segment base
6971 455 ; Remove leading zeroes from the number.
6973 0000016D 457 skip_leading_zeroes:
6974 0000016D 8A6435F2 458 mov ah,bcd_byte[esi] ; Get BCD byte
6975 00000171 88E0 459 mov al,ah ; Copy value
6976 00000173 D2E8 460 shr al,cl ; Get high order digit
6977 00000175 240F 461 and al,0fh ; Set zero flag
6978 00000177 7517 462 jnz enter_odd ; Exit loop if leading
6979 463 ; non zero found
6981 00000179 88E0 465 mov al,ah ; Get BCD byte again
6982 0000017B 240F 466 and al,0fh ; Get low order digit
6983 0000017D 7519 467 jnz enter_even ; Exit loop if non zero
6986 0000017F 4E 470 dec esi ; Decrement BCD index
6987 00000180 79EB 471 jns skip_leading_zeroes
6989 473 ; The significand was all zeroes.
6991 00000182 B030 475 mov al,`0' ; Set initial zero
6992 00000184 AA 476 stosb
6993 00000185 43 477 inc ebx ; Bump string length
6994 00000186 EB17 478 jmp short exit_with_value
6996 480 ; Now expand the BCD string into digit
6997 481 ; per byte values 0-9.
6999 00000188 483 digit_loop:
7000 00000188 8A6435F2 484 mov ah,bcd_byte[esi] ; Get BCD byte
7001 0000018C 88E0 485 mov al,ah
7002 0000018E D2E8 486 shr al,cl ; Get high order digit
7003 00000190 487 enter_odd:
7004 00000190 0430 488 add al,`0' ; Convert to ASCII
7005 00000192 AA 489 stosb ; Put digit into ASCII
7007 00000193 88E0 491 mov al,ah ; Get low order digit
7008 00000195 240F 492 and al,0fh
7009 00000197 43 493 inc ebx ; Bump field size counter
7010 00000198 494 enter_even:
7011 00000198 0430 495 add al,`0' ; Convert to ASCII
7012 0000019A AA 496 stosb ; Put digit into ASCII area
7013 0000019B 43 497 inc ebx ; Bump field size counter
7014 0000019C 4E 498 dec esi ; Go to next BCD byte
7015 0000019D 79E9 499 jns digit_loop
7017 501 ; Conversion complete. Set the string
7018 502 ; size and remainder.
7020 0000019F 504 exit_with_value:
7021 0000019F 8B7D14 505 mov edi,size_ptr
7022 000001A2 66891F 506 mov word ptr [edi],bx
7023 000001A5 8BC2 507 mov eax,edx ; Set return value
7024 000001A7 E980FEFFFF 508 jmp exit_proc
7026 000001AC 510 floating_to_ascii endp
7028 -------- 512 code ends
7031 ASSEMBLY COMPLETE, NO WARNINGS, NO ERRORS.
7034 XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE GET_POWER_10
7035 OBJECT MODULE PLACED IN power10.obj
7036 ASSEMBLER INVOKED BY: asm386 power10.asm
7040 1 +1 $title(Calculate the value of 10**ax)
7042 3 ; This subroutine will calculate the
7043 4 ; value of 10**eax. For values of
7044 5 ; 0 <= eax < 19, the result will be exact.
7045 6 ; All 80386 registers are transparent
7046 7 ; and the value is returned on the TOS
7047 8 ; as two numbers, exponent in ST(1) and
7048 9 ; fraction in ST(0). The exponent value
7049 10 ; can be larger than the largest
7050 11 ; exponent of an extended real format
7051 12 ; number. Three stack entries are used.
7053 14 name get_power_10
7054 00000000 15 public get_power_10,power_table
7056 -------- 17 stack stackseg 8
7058 -------- 19 code segment public er
7060 21 ; Use exact values from 1.0 to 1e18.
7062 23 even ; Optimize 16 bit access
7063 00000000 000000000000F03F 24 power_table dq 1.0,1e1,1e2,1e3
7064 00000008 0000000000002440
7065 00000010 0000000000005940
7066 00000018 0000000000408F40
7067 00000020 000000000088C340 25 dq 1e4,1e5,1e6,1e7
7068 00000028 00000000006AF840
7069 00000030 0000000080842E41
7070 00000038 00000000D0126341
7071 00000040 0000000084D79741 26 dq 1e8,1e9,1e10,1e11
7072 00000048 0000000065CDCD41
7073 00000050 000000205FA00242
7074 00000058 000000E876483742
7075 00000060 000000A2941A6D42 27 dq 1e12,1e13,1e14,1e15
7076 00000068 000040E59C30A242
7077 00000070 0000901EC4BCD642
7078 00000078 00003420F56B0C43
7079 00000080 0080E03779C34143 28 dq 1e16,1e17,1e18
7080 00000088 00A0D88557347643
7081 00000090 00C84E676DC1AB43
7083 00000098 30 get_power_10 proc
7085 00000098 3D12000000 32 cmp eax,18 ; Test for 0 <= ax < 19
7086 0000009D 770B 33 ja out_of_range
7088 0000009F 2EDD04C500000000 R 35 fld power_table[eax*8]; Get exact value
7089 000000A7 D9F4 36 fxtract ; Separate power
7092 7.3.1 Function Partitioning
7094 Three separate modules implement the conversion. Most of the work of the
7095 conversion is done in the module FLOATING_TO_ASCII. The other modules are
7096 provided separately, because they have a more general use. One of them,
7097 GET_POWER_10, is also used by the ASCII to floating-point conversion
7098 routine. The other small module, TOS_STATUS, identifies what, if anything,
7099 is in the top of the numeric register stack.
7102 7.3.2 Exception Considerations
7104 Care is taken inside the function to avoid generating exceptions. Any
7105 possible numeric value is accepted. The only possible exception is
7106 insufficient space on the numeric register stack.
7108 The value passed in the numeric stack is checked for existence, type (NaN
7109 or infinity), and status (denormal, zero, sign). The string size is tested
7110 for a minimum and maximum value. If the top of the register stack is empty,
7111 or the string size is too small, the function returns with an error code.
7113 Overflow and underflow are avoided inside the function for very large or very small numbers.
7117 7.3.3 Special Instructions
7119 The functions demonstrate the operation of several numeric instructions,
7120 different data types, and precision control. Shown are instructions for
7121 automatic conversion to BCD, calculating the value of 10 raised to an
7122 integer value, establishing and maintaining concurrency, data
7123 synchronization, and use of directed rounding on the NPX.
7125 Without the extended precision data type and built-in exponential function,
7126 the double precision accuracy of this function could not be attained with
7127 the size and speed of the shown example.
7129 The function relies on the numeric BCD data type for conversion from binary
7130 floating-point to decimal. It is not difficult to unpack the BCD digits into
7131 separate ASCII decimal digits. The major work involves scaling the
7132 floating-point value to the comparatively limited range of BCD values. To
7133 print a 9-digit result requires accurately scaling the given value to an
7134 integer between 10^(8) and 10^(9). For example, the number +0.123456789
7135 requires a scaling factor of 10^(9) to produce the value +123456789.0, which
7136 can be stored in 9 BCD digits. The scale factor must be an exact power of
7137 10 to avoid changing any of the printed digit values.
7139 These routines should exactly convert all values exactly representable in
7140 decimal in the field size given. Integer values that fit in the given string
7141 size are not scaled, but directly stored into the BCD form. Noninteger
7142 values exactly representable in decimal within the string size limits are
7143 also exactly converted. For example, 0.125 is exactly representable in
7144 binary or decimal. To convert this floating-point value to decimal, the
7145 scaling factor is 1000, resulting in 125. When scaling a value, the function
7146 must keep track of where the decimal point lies in the final decimal value.
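The exact-scaling requirement described above can be sketched in a few lines of Python. This is an illustration only, not part of the ASM386 routines; the helper name scale_to_integer is hypothetical, and Python's Fraction type stands in for exact arithmetic.

```python
from fractions import Fraction

def scale_to_integer(value, max_digits):
    """Return (digits, power) with value == digits * 10**power, or None.

    Hypothetical helper: searches for the smallest exact power of ten that
    turns `value` into an integer of at most `max_digits` decimal digits.
    """
    exact = Fraction(value)              # exact binary value of the float
    for shift in range(0, max_digits + 1):
        scaled = exact * 10**shift       # multiply by an exact power of ten
        if scaled.denominator == 1 and abs(scaled.numerator) < 10**max_digits:
            return scaled.numerator, -shift
    return None                          # not exactly representable in the field

print(scale_to_integer(0.125, 9))        # 0.125 * 1000 = 125, power value -3
```

The second element of the result plays the role of the power value: it records where the decimal point lies relative to the integer digits.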
7149 7.3.4 Description of Operation
7151 Converting a floating-point number to decimal ASCII takes three major
7152 steps: identifying the magnitude of the number, scaling it for the BCD data
7153 type, and converting the BCD data type to a decimal ASCII string.
7155 Identifying the magnitude of the result requires finding the value X such
7156 that the number is represented by I * 10^(X), where 1.0 <= I < 10.0. Scaling
7157 the number requires multiplying it by a scaling factor 10^(S), so that the
7158 result is an integer requiring no more decimal digits than provided for in the ASCII string.
7161 Once scaled, the numeric rounding modes and BCD conversion put the number
7162 in a form easy to convert to decimal ASCII by host software.
7164 Implementing each of these three steps requires attention to detail. To
7165 begin with, not all floating-point values have a numeric meaning. Values
7166 such as infinity, indefinite, or NaN may be encountered by the conversion
7167 routine. The conversion routine should recognize these values and identify them uniquely.
7170 Special cases of numeric values also exist. Denormals have numeric values,
7171 but should be recognized because they indicate that precision was lost
7172 during some earlier calculations.
7174 Once it has been determined that the number has a numeric value, and it is
7175 normalized (setting appropriate denormal flags, if necessary, to indicate
7176 this to the calling program), the value must be scaled to the BCD range.
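The screening step described above might look like the following Python sketch. It assumes the IEEE double format that Python floats use, not the 80-bit extended format examined by TOS_STATUS, and the function name classify and its string results are hypothetical.

```python
import math
import struct

def classify(x):
    """Classify a Python float (IEEE double) before attempting conversion."""
    if math.isnan(x):
        return "nan"
    if math.isinf(x):
        return "infinity"
    if x == 0.0:
        return "zero"
    # Reinterpret the double as a 64-bit integer to read its exponent field.
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    exponent_field = (bits >> 52) & 0x7FF
    # A zero exponent field with a nonzero fraction marks a denormal.
    return "denormal" if exponent_field == 0 else "normal"
```

Only values classified as "normal" or "denormal" carry a numeric value that needs scaling; the other classes can be reported to the caller directly.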
7179 7.3.5 Scaling the Value
7181 To scale the number, its magnitude must be determined. It is sufficient to
7182 calculate the magnitude to an accuracy of 1 unit, or within a factor of 10
7183 of the required value. After scaling the number, a check is made to see if
7184 the result falls in the range expected. If not, the result can be adjusted
7185 one decimal order of magnitude up or down. The adjustment test after the
7186 scaling is necessary due to inevitable inaccuracies in the scaling value.
7188 Because the magnitude estimate for the scale factor need only be close, a
7189 fast technique is used. The magnitude is estimated by multiplying the power
7190 of 2, the unbiased floating-point exponent, associated with the number by
7191 log{10}2. Rounding the result to an integer produces an estimate of
7192 sufficient accuracy. Ignoring the fraction value can introduce a maximum
7193 error of 0.32 in the result.
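A minimal Python sketch of this estimate follows. It assumes IEEE doubles; math.frexp stands in for the FXTRACT instruction, and the function name is hypothetical.

```python
import math

def estimate_decimal_magnitude(x):
    """Estimate the decimal order of magnitude of x from its binary exponent."""
    fraction, exponent = math.frexp(abs(x))   # |x| = fraction * 2**exponent, 0.5 <= fraction < 1
    power_two = exponent - 1                  # renormalize so that 1 <= fraction < 2
    # Ignore log10(fraction), which lies in [0, log10(2)), and round.
    return round(power_two * math.log10(2))

for value in (1.0, 9.99, 123456.0, 0.001, 9.99e17):
    print(value, estimate_decimal_magnitude(value))
```

The estimate can differ from the true decimal exponent by one, which is why the scaled result is checked and adjusted afterward.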
7195 Using the magnitude of the value and size of the number string, the scaling
7196 factor can be calculated. Calculating the scaling factor is the most
7197 inaccurate operation of the conversion process. The relation
7198 10^(X) = 2^(X * log{2}10) is used for this function. The exponentiate
7199 instruction F2XM1 is used.
7201 Due to restrictions on the range of values allowed by the F2XM1
7202 instruction, the power of 2 value is split into integer and fraction
7203 components. The relation 2^(I + F) = 2^(I) * 2^(F) allows using the FSCALE
7204 instruction to recombine the 2^(F) value, calculated through F2XM1, and the
7208 7.3.5.1 Inaccuracy in Scaling
7210 The inaccuracy in calculating the scale factor arises because of the
7211 trailing zeros placed into the fraction value of the power of two when
7212 stripping off the integer valued bits. For each integer valued bit in the
7213 power of 2 value separated from the fraction bits, one bit of precision is
7214 lost in the fraction field due to the zero fill occurring in the least significant bits.
7217 Up to 14 bits may be lost in the fraction because the largest allowed
7218 floating point exponent value is 2^(14) - 1. These bits directly reduce the
7219 accuracy of the calculated scale factor, thereby reducing the accuracy of
7220 the scaled value. For numbers in the range of 10^(±30), a maximum of 8 bits
7221 of precision are lost in the scaling process.
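One way to read the bit-loss argument is that the number of fraction bits lost equals the number of bits in the integer part of the power of two. The Python sketch below counts those bits under that assumption; the function name is hypothetical, and the counts are an interpretation of the text rather than measured 80387 behavior.

```python
import math

def fraction_bits_lost(decimal_power):
    """Bits occupied by the integer part of decimal_power * log2(10)."""
    power_two = abs(decimal_power) * math.log2(10)
    return int(power_two).bit_length()

print(fraction_bits_lost(4932))   # integer part 16383 needs 14 bits
print(fraction_bits_lost(48))     # integer part 159 needs 8 bits
```

Under this reading, a scale factor near 10^(48), roughly what a value near 10^(-30) needs to fill an 18-digit field, has an 8-bit integer part.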
7224 7.3.5.2 Avoiding Underflow and Overflow
7226 The fraction and exponent fields of the number are separated to avoid
7227 underflow and overflow in calculating the scaling values. For example, to
7228 scale 10^(-4932) to 10^(8) requires a scaling factor of 10^(4950), which
7229 cannot be represented by the NPX.
7231 By separating the exponent and fraction, the scaling operation involves
7232 adding the exponents separate from multiplying the fractions. The exponent
7233 arithmetic involves small integers, all easily represented by the NPX.
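The same idea can be tried with IEEE doubles in Python, where 10^(400) plays the role that 10^(4950) plays for the extended format. The sketch below is an assumption-laden illustration, with a hypothetical function name; it never forms the scale factor as a single number.

```python
import math

def scale_by_power_of_ten(value, decimal_power):
    """Multiply value by 10**decimal_power without forming 10**decimal_power."""
    t = decimal_power * math.log2(10)
    scale_exp = math.floor(t)                   # integer power of two of the scale factor
    scale_frac = 2.0 ** (t - scale_exp)         # fraction of the scale factor, 1 <= frac < 2
    value_frac, value_exp = math.frexp(value)   # value = value_frac * 2**value_exp
    # Multiply the fractions and add the exponents as separate small operations.
    return math.ldexp(value_frac * scale_frac, value_exp + scale_exp)

# 10.0**400 overflows a double, yet the scaled result is representable.
print(scale_by_power_of_ten(1e-200, 400))
```

The printed value is close to 1e200; the small error comes from the rounding of the power of two, as discussed under inaccuracy in scaling.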
7236 7.3.5.3 Final Adjustments
7238 It is possible that the power function (Get_Power_10) could produce a
7239 scaling value such that it forms a scaled result larger than the ASCII field
7240 could allow. For example, scaling 9.9999999999999999 * 10^(4900) by
7241 1.00000000000000010 * 10^(-4883) produces 1.00000000000000009 * 10^(18). The
7242 scale factor is within the accuracy of the NPX and the result is within the
7243 conversion accuracy, but it cannot be represented in BCD format. This is why
7244 there is a post-scaling test on the magnitude of the result. The result can
7245 be multiplied or divided by 10, depending on whether the result was too
7246 small or too large, respectively.
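The post-scaling test can be sketched as follows. This is a simplified integer model, not the routine from the manual: it assumes an 18-digit field (the packed decimal capacity), treats the scaled value as an already rounded nonnegative integer, and uses the hypothetical name adjust. The power value is corrected so that scaled * 10**power is unchanged apart from the dropped digit.

```python
MAX_DIGITS = 18  # assumption: the field holds 18 decimal digits

def adjust(scaled: int, power: int):
    """Bring scaled into the 18-digit range, correcting power to match."""
    if scaled >= 10 ** MAX_DIGITS:           # too large: divide by 10
        return scaled // 10, power + 1
    if scaled < 10 ** (MAX_DIGITS - 1):      # too small: multiply by 10
        return scaled * 10, power - 1
    return scaled, power

# The 19-digit result from the example above is reduced to 18 digits.
print(adjust(1000000000000000090, 0))
```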
7251 For maximum flexibility in output formats, the position of the decimal
7252 point is indicated by a binary integer called the power value. If the power
7253 value is zero, then the decimal point is assumed to be at the right of the
7254 rightmost digit. Power values greater than zero indicate how many trailing
7255 zeros are not shown. For each unit below zero, move the decimal point to the
7258 The last step of the conversion is storing the result in BCD and indicating
7259 where the decimal point lies. The BCD string is then unpacked into ASCII
7260 decimal characters. The ASCII sign is set corresponding to the sign of the
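How a caller might apply the power value to the digit string can be sketched in Python. The function name place_decimal_point is hypothetical, and the sketch assumes that each unit below zero moves the decimal point one position to the left.

```python
def place_decimal_point(digits: str, power: int) -> str:
    """Render a digit string whose decimal point position is given by power."""
    if power >= 0:
        return digits + "0" * power          # trailing zeros that are not shown
    shift = -power
    digits = digits.rjust(shift + 1, "0")    # keep one digit left of the point
    return digits[:-shift] + "." + digits[-shift:]

for power in (0, 2, -2, -7):
    print(power, place_decimal_point("12345", power))
```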
7264 7.4 Trigonometric Calculation Examples (Not Tested)
7266 In this example, the kinematics of a robot arm is modeled with the 4 * 4
7267 homogeneous transformation matrices proposed by Denavit and Hartenberg
7268 (J. Denavit and R.S. Hartenberg, "A Kinematic Notation for Lower-Pair
7269 Mechanisms Based on Matrices," J. Applied Mechanics, June 1955, pp. 215-221;
7271 C.S. George Lee, "Robot Arm Kinematics, Dynamics, and Control," IEEE
7272 Computer, Dec. 1982).
7273 The translational and rotational relationships between adjacent links are
7274 described with these matrices using the D-H matrix method. For each link,
7275 there is a 4 * 4 homogeneous transformation matrix that represents the
7276 link's coordinate system (L{i}) at the joint (J{i}) with respect to the
7277 previous link's coordinate system (J{i-1}, L{i-1}). The following four
7278 geometric quantities completely describe the motion of any rigid joint/link
7279 pair (J{i}, L{i}), as Figure 7-7 illustrates
7280 (see page 7-22 in the printed version of this manual).
7282 θ{i} = The angular displacement of the x{i} axis from the x{i-1} axis by
7283 rotating around the z{i-1} axis (anticlockwise).
7285 d{i} = The distance from the origin of the (i-1)^(th) coordinate system
7286 along the z{i-1} axis to the x{i} axis.
7288 a{i} = The distance of the origin of the i^(th) coordinate system from
7289 the z{i-1} axis along the -x{i} axis.
7291 α{i} = The angular displacement of the z{i} axis from the z{i-1} axis about
7292 the x{i} axis (anticlockwise).
7294 The D-H transformation matrix A^(i){i-1} for adjacent coordinate frames
7295 (from joint{i-1} to joint{i}) is calculated as follows:
7297 A^(i){i-1} = T{z,d} * T{z,θ} * T{x,a} * T{x,α}
7301 T{z,d} represents a translation along the z{i-1} axis
7303 T{z,θ} represents a rotation of angle θ about the z{i-1} axis
7305 T{x,a} represents a translation along the x{i} axis
7307 T{x,α} represents a rotation of angle α about the x{i} axis
7309              | COS θ{i}   -COS α{i} SIN θ{i}    SIN α{i} SIN θ{i}   a{i} COS θ{i} |
7310 A^(i){i-1} = | SIN θ{i}    COS α{i} COS θ{i}   -SIN α{i} COS θ{i}   a{i} SIN θ{i} |
7311              | 0           SIN α{i}             COS α{i}            d{i}          |
                  | 0           0                    0                   1             |
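The element formulas can be checked with a short Python sketch. It is not the 80387 routine; the element layout follows the comments in trans_proc (a11 = cos theta, a14 = A * cos theta, a34 = D, and so on), and the function name dh_matrix and its argument order are assumptions made for illustration. Angles are given in degrees, as in the listing.

```python
import math

def dh_matrix(theta_deg: float, alpha_deg: float, a: float, d: float):
    """Build the 4 x 4 D-H matrix for one joint/link pair as nested lists."""
    theta = math.radians(theta_deg)
    alpha = math.radians(alpha_deg)
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -ca * st,  sa * st, a * ct],
        [st,  ca * ct, -sa * ct, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ]

# With both angles zero the matrix reduces to a pure translation by (a, 0, d).
for row in dh_matrix(0.0, 0.0, 2.0, 3.0):
    print(row)
```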
7314 The composite homogeneous matrix T which represents the position and
7315 orientation of the joint/link pair with respect to the base system is
7316 obtained by successively multiplying the D-H transformation matrices for
7317 adjacent coordinate frames.
7319 T^(i){0} = A^(1){0} * A^(2){1} * ... * A^(i){i-1}
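The successive multiplication can be sketched in Python as a left-to-right reduction over a list of 4 x 4 matrices. The names matmul4, composite, and translate_x are hypothetical; translate_x only supplies simple test matrices.

```python
from functools import reduce

def matmul4(a, b):
    """Product of two 4 x 4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def composite(matrices):
    """T = A1 * A2 * ... * An for a non-empty list of matrices."""
    return reduce(matmul4, matrices)

def translate_x(a):
    """Pure translation along x, used here only as sample input."""
    return [[1.0, 0.0, 0.0, a],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

# Two translations along x compose into one translation by the sum.
print(composite([translate_x(1.0), translate_x(2.0)])[0][3])
```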
7321 This example in Figure 7-8 illustrates how the transformation process can
7322 be accomplished using the 80387. The program consists of two major
7323 procedures. The first procedure TRANS_PROC is used to calculate the elements
7324 in each D-H matrix, A^(i){i-1}. The second procedure MATRIXMUL_PROC finds
7325 the product of two successive D-H matrices.
7328 Figure 7-8. Robot Arm Kinematics Example
7330 XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE TOS_STATUS
7331 OBJECT MODULE PLACED IN tos.obj
7332 ASSEMBLER INVOKED BY: asm386 tos.asm
7336 1 +1 $title(Determine TOS register contents)
7338 3 ; This subroutine will return a value
7339 4 ; from 0-15 in eax corresponding
7340 5 ; to the contents of NPX TOS. All
7341 6 ; registers are transparent and no
7342 7 ; errors are possible. The return
7343 8 ; value corresponds to c3,c2,c1,c0
7344 9 ; of FXAM instruction.
7347 00000000 12 public tos_status
7349 -------- 14 stack stackseg 6
7351 -------- 16 code segment public er
7353 00000000 18 tos_status proc
7355 00000000 D9E5 20 fxam ; Get status of TOS register
7356 00000002 9BDFE0 21 fstsw ax ; Get current status
7357 00000005 88E0 22 mov al,ah ; Put bits 10-8 into bits 2-0
7358 00000007 2507400000 23 and eax,4007h ; Mask out bits c3,c2,c1,c0
7359 0000000C C0EC03 24 shr ah, 3 ; Put bit c3 into bit 11
7360 0000000F 08E0 25 or al,ah ; Put c3 into bit 3
7361 00000011 B400 26 mov ah,0 ; Clear return value
7364 00000014 29 tos_status endp
7366 -------- 31 code ends
7369 ASSEMBLY COMPLETE, NO WARNINGS, NO ERRORS.
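The bit shuffling in tos_status can be followed step by step in Python. The sketch below models AH and AL as integers and takes the status word as an argument instead of executing FXAM and FSTSW; it relies on the condition codes C0, C1, C2, and C3 occupying bits 8, 9, 10, and 14 of the status word. The Python function is an illustration only.

```python
def tos_status(status_word: int) -> int:
    """Return C3,C2,C1,C0 of a status word as a value from 0 to 15."""
    ah = (status_word >> 8) & 0xFF
    al = ah                              # mov al,ah   (C2,C1,C0 now in bits 2-0)
    ax = ((ah << 8) | al) & 0x4007       # and eax,4007h
    ah, al = ax >> 8, ax & 0xFF
    ah >>= 3                             # shr ah,3    (C3 moves from bit 14 to bit 11)
    al |= ah                             # or al,ah    (C3 lands in bit 3)
    return al                            # mov ah,0

# C3 and C0 set (bits 14 and 8) give binary 1001.
print(tos_status(0x4100))
```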
7375 000000A9 C3 38 ret ; OK to leave fxtract running
7377 40 ; Calculate the value using the
7378 41 ; exponentiate instruction. The following
7379 42 ; relations are used:
7380 43 ; 10**x = 2**(log2(10)*x)
7381 44 ; 2**(I+F) = 2**I * 2**F
7382 45 ; if st(1) = I and st(0) = 2**F then
7383 46 ; fscale produces 2**(I+F)
7385 000000AA 48 out_of_range:
7387 000000AA D9E9 50 fldl2t ; TOS = LOG2(10)
7388 000000AC C8040000 51 enter 4,0
7390 53 ; save power of 10 value, P
7391 000000B0 8945FC 54 mov [ebp-4],eax
7393 56 ; TOS,X = LOG2(10)*P = LOG2(10**P)
7394 000000B3 DA4DFC 57 fimul dword ptr [ebp-4]
7395 000000B6 D9E8 58 fld1 ; Set TOS = -1.0
7396 000000B8 D9E0 59 fchs
7397 000000BA D9C1 60 fld st(1) ; Copy power value
7399 000000BC D9FC 62 frndint ; TOS = I: -inf < I <= X
7400 63 ; where I is an integer
7401 64 ; Rounding mode does
7403 000000BE D9CA 66 fxch st(2) ; TOS = X, ST(1) = -1.0
7405 000000C0 D8E2 68 fsub st,st(2) ; TOS,F = X-I:
7406 69 ; -1.0 < TOS <= 1.0
7408 71 ; Restore original rounding control
7409 000000C2 58 72 pop eax
7410 000000C3 D9F0 73 f2xm1 ; TOS = 2**(F) - 1.0
7411 000000C5 C9 74 leave ; Restore stack
7412 000000C6 DEE1 75 fsubr ; Form 2**(F)
7413 000000C8 C3 76 ret ; OK to leave fsubr running
7415 000000C9 78 get_power_10 endp
7417 -------- 80 code ends
7420 ASSEMBLY COMPLETE, NO WARNINGS, NO ERRORS.
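The relations used in this fragment (10**x = 2**(log2(10)*x) and 2**(I+F) = 2**I * 2**F) can be traced in Python. The sketch uses double precision rather than the NPX extended format, assumes rounding toward minus infinity for the integer part as the FRNDINT comment indicates, and uses the hypothetical names power_of_ten_parts and power_of_ten. The library call math.expm1 stands in for F2XM1 and math.ldexp for FSCALE.

```python
import math

def power_of_ten_parts(p: int):
    """Return (I, 2**F) such that 10**p = 2**I * 2**F with 0 <= F < 1."""
    x = p * math.log2(10)                        # fldl2t, fimul
    i = math.floor(x)                            # frndint, rounding down
    f = x - i                                    # fsub
    two_f = math.expm1(f * math.log(2)) + 1.0    # f2xm1, then add back 1.0
    return i, two_f

def power_of_ten(p: int) -> float:
    i, two_f = power_of_ten_parts(p)
    return math.ldexp(two_f, i)                  # the caller's fscale step

print(power_of_ten(3), power_of_ten(-2))
```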
7423 XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE ROT_MATRIX_CAL
7424 OBJECT MODULE PLACED IN transx.obj
7425 ASSEMBLER INVOKED BY: asm386 transx.asm
7429 1 Name ROT_MATRIX_CAL
7433 5 ; This example illustrates the use
7434 6 ; of the 80387 floating point
7435 7 ; instructions, in particular, the
7436 8 ; FSINCOS function which gives both
7437 9 ; the SIN and COS values.
7438 10 ; The program calculates the
7439 11 ; composite matrix for base to
7440 12 ; end-effector transformation.
7442 14 ; Only the kinematics is considered in
7445 17 ; If the composite matrix mentioned above
7447 19 ; T1n = A1 x A2 x ... x An
7448 20 ; T1n is found by successively calling
7449 21 ; trans_proc and matrixmul_proc until
7450 22 ; all matrices have been exhausted.
7452 24 ; trans_proc calculates entries in each
7453 25 ; A(A1,...,An) while matrixmul_proc
7454 26 ; performs the matrix multiplication for
7455 27 ; Ai and Ai+1. matrixmul_proc in turn
7456 28 ; calls matrix_row and matrix_elem to
7457 29 ; do the multiplication.
7460 32 ; Define stack space
7462 -------- 34 trans_stack stackseg 400
7464 36 ; Define the matrix structure for
7465 37 ; 4X4 transformational matrices
7467 -------- 39 a_matrix struc
7468 00000000 40 a11 dq ?
7469 00000008 41 a12 dq ?
7470 00000010 42 a13 dq ?
7471 00000018 43 a14 dq ?
7472 00000020 44 a21 dq ?
7473 00000028 45 a22 dq ?
7474 00000030 46 a23 dq ?
7475 00000038 47 a24 dq ?
7476 00000040 48 a31 dq 0h
7477 00000048 49 a32 dq ?
7478 00000050 50 a33 dq ?
7479 00000058 51 a34 dq ?
7480 00000060 52 a41 dq 0h
7481 00000068 53 a42 dq 0h
7482 00000070 54 a43 dq 0h
7483 00000078 55 a44 dq 1h
7484 -------- 56 a_matrix ends
7486 58 ; Assume One joint in the storage
7487 59 ; allocation and hence for
7488 60 ; two sets of parameters; however,
7489 61 ; more joints are possible
7492 00000000 64 alpha_deg1 dd ?
7493 00000004 65 alpha_deg2 dd ?
7494 -------- 66 alp_deg ends
7496 -------- 68 tht_deg struc
7497 00000000 69 theta_deg1 dd ?
7498 00000004 70 theta_deg2 dd ?
7499 -------- 71 tht_deg ends
7501 -------- 73 A_array struc
7504 -------- 76 A_array ends
7506 -------- 78 D_array struc
7509 -------- 81 D_array ends
7511 83 ; trans_data is the data segment
7514 ------- 86 trans_data segment rw public
7517 00000000 ????????????????
7518 00000008 ????????????????
7519 00000010 ????????????????
7520 00000018 ????????????????
7521 00000020 ????????????????
7522 00000028 ????????????????
7523 00000030 ????????????????
7524 00000038 ????????????????
7525 00000040 0000000000000000
7526 00000048 ????????????????
7527 00000050 ????????????????
7528 00000058 ????????????????
7529 00000060 0000000000000000
7530 00000068 0000000000000000
7531 00000070 0000000000000000
7532 00000078 0100000000000000
7533 00000080 ???????????????? 89 Bmx a_matrix<>
7534 00000088 ????????????????
7535 00000090 ????????????????
7536 00000098 ????????????????
7537 000000A0 ????????????????
7538 000000A8 ????????????????
7539 000000B0 ????????????????
7540 000000B8 ????????????????
7541 000000C0 0000000000000000
7542 000000C8 ????????????????
7543 000000D0 ????????????????
7544 000000D8 ????????????????
7545 000000E0 0000000000000000
7546 000000E8 0000000000000000
7547 000000F0 0000000000000000
7548 000000F8 0100000000000000
7549 00000100 ???????????????? 90 Tmx a_matrix<>
7550 00000108 ????????????????
7551 00000110 ????????????????
7552 00000118 ????????????????
7553 00000120 ????????????????
7554 00000128 ????????????????
7555 00000130 ????????????????
7556 00000138 ????????????????
7557 00000140 0000000000000000
7558 00000148 ????????????????
7559 00000150 ????????????????
7560 00000158 ????????????????
7561 00000160 0000000000000000
7562 00000168 0000000000000000
7563 00000170 0000000000000000
7564 00000178 0100000000000000
7565 00000180 ???????? 91 ALPHA_DEG alp_deg<>
7567 00000188 ???????? 92 THETA_DEG tht_deg<>
7569 00000190 ???????????????? 93 A_VECTOR A_array<>
7570 00000198 ????????????????
7571 000001A0 ???????????????? 94 D_VECTOR D_array<>
7572 000001A8 ????????????????
7573 000001B0 00000000 95 ZERO dd 0
7574 000001B4 B4000000 96 d180 dd 180
7575 0001 97 NUM_JOINT equ 1
7576 0004 98 NUM_ROW equ 4
7577 0004 99 NUM_COL equ 4
7578 000001B8 01 100 REVERSE db 1h
7579 -------- 101 trans_data ends
7581 103 assume ds:trans_data, es:trans_data
7584 106 ; trans_code contains the procedures
7585 107 ; for calculating matrix elements and
7586 108 ; matrix multiplications
7588 -------- 110 trans_code segment er public
7590 112 ; create mnemonics for fsincos which is not
7591 113 ; yet available from ASM386 as of now
7593 C MACRO 115 codemacro fsincos
7597 00000000 119 trans_proc proc far
7600 122 ; Calculate alpha and theta in radians
7601 123 ; from their values in degrees
7603 00000000 D9EB 125 fldpi
7604 00000002 D835B4010000 R 126 fdiv d180
7606 128 ; Duplicate pi/180
7607 00000008 D9C0 129 fld st
7609 0000000A DC0CCD80010000 R 131 fmul qword ptr ALPHA_DEG[ecx*8]
7610 00000011 D9C9 132 fxch st(1)
7611 00000013 DC0CCD88010000 R 133 fmul qword ptr THETA_DEG[ecx*8]
7613 135 ; theta(radians) in ST and
7614 136 ; alpha(radians) in ST(1)
7616 138 ; Calculate matrix elements
7617 139 ; a11 = cos theta
7618 140 ; a12 = - cos alpha * sin theta
7619 141 ; a13 = sin alpha * sin theta
7620 142 ; a14 = A * cos theta
7621 143 ; a21 = sin theta
7622 144 ; a22 = cos alpha * cos theta
7623 145 ; a23 = -sin alpha * cos theta
7624 146 ; a24 = A * sin theta
7625 147 ; a32 = sin alpha
7626 148 ; a33 = cos alpha
7628 150 ; a31 = a41 = a42 = a43 = 0.0
7631 153 ; ebx contains the offset for the matrix
7633 0000001A D9FB 155 fsincos ;cos theta in ST
7634 156 ;sin theta in ST(1)
7635 0000001C D9C0 157 fld st ;duplicate cos theta
7636 0000001E DD13 158 fst [ebx].a11 ;cos theta in a11
7637 00000020 DC0CCD90010000 R 159 fmul qword ptr A_VECTOR[ecx*8]
7638 00000027 DD5B18 160 fstp [ebx].a14 ;A * cos theta in a14
7639 0000002A D9C9 161 fxch st(1) ;sin theta in ST
7640 0000002C DD5320 162 fst [ebx].a21 ;sin theta in a21
7641 0000002F D9C0 163 fld st ;duplicate sin theta
7642 00000031 DC0CCD90010000 R 164 fmul qword ptr A_VECTOR[ecx*8]
7643 00000038 DD5B38 165 fstp [ebx].a24 ;A * sin theta in a24
7644 0000003B D9C2 166 fld st(2) ;alpha in ST
7645 0000003D D9FB 167 fsincos ;cos alpha in ST
7646 168 ;sin alpha in ST(1)
7647 169 ;sin theta in ST(2)
7648 170 ;cos theta in ST(3)
7649 0000003F DD5350 171 fst [ebx].a33 ;cos alpha in a33
7650 00000042 D9C9 172 fxch st(1) ;sin alpha in ST
7651 00000044 DD5348 173 fst [ebx].a32 ;sin alpha in a32
7652 00000047 D9C2 174 fld ST(2) ;sin theta in ST
7653 175 ;sin alpha in ST(1)
7654 00000049 D8C9 176 fmul st,st(1) ;sin alpha * sin theta
7655 0000004B DD5B10 177 fstp [ebx].a13 ;stored in a13
7656 0000004E D8CB 178 fmul st,st(3) ;cos theta * sin alpha
7657 00000050 D9E0 179 fchs ;-cos theta * sin alpha
7658 00000052 DD5B30 180 fstp [ebx].a23 ;stored in a23
7659 00000055 D9C2 181 fld st(2) ;cos theta in ST
7660 182 ;cos alpha in ST(1)
7661 183 ;sin theta in ST(2)
7662 184 ;cos theta in ST(3)
7663 00000057 D8C9 185 fmul st,st(1) ;cos theta * cos alpha
7664 00000059 DD5B28 186 fstp [ebx].a22 ;stored in a22
7665 0000005C D8C9 187 fmul st,st(1) ;cos alpha * sin theta
7667 189 ; To take advantage of parallel operations
7668 190 ; between the CPU and NPX
7670 0000005E 50 192 push eax ; save eax
7672 194 ; also move D into a34 in a faster way
7673 0000005F 8B04CDA0010000 R 195 mov eax, dword ptr D_VECTOR[ecx*8]
7674 00000066 894358 196 mov dword ptr [ebx + 88], eax
7675 00000069 8B04CDA4010000 R 197 mov eax, dword ptr D_VECTOR[ecx*8 + 4]
7676 00000070 89435C 198 mov dword ptr [ebx + 92], eax
7677 00000073 58 199 pop eax ; restore eax
7678 00000074 D9E0 200 fchs ;-cos alpha * sin theta
7679 00000076 DD5B08 201 fstp [ebx].a12 ;stored in a12
7680 202 ;and all nonzero elements
7681 203 ;have been calculated
7684 0000007A 206 trans_proc endp
7687 0000007A 209 matrix_elem proc far
7689 211 ; This procedure calculates the dot product
7690 212 ; of the ith row of the first matrix and
7691 213 ; the jth column of the second matrix:
7693 215 ; Tij where Tij = sum of Aik x Bkj over k
7695 217 ; parameters passed from the calling routine,
7699 221 ; local register, EBP = (k-1)*8
7701 0000007A 55 223 push ebp ; save ebp
7702 0000007B 51 224 push ecx ; ecx to be used as a tmp reg
7703 0000007C 8BCE 225 mov ecx, esi; save it for later indexing
7705 227 ; locating the element in the first matrix, A
7706 0000007E 6BC904 228 imul ecx, NUM_COL ; ecx contains offset due
7707 229 ; to preceding rows; the
7708 230 ; offset is from the
7709 231 ; beginning of the matrix
7711 00000081 31ED 233 xor ebp, ebp; clear ebp, which will be
7712 234 ; used as a temp reg to index (k)
7713 235 ; across the ith row of the first
7714 236 ; matrix as well as down the jth
7715 237 ; column of the second matrix
7717 239 ; clear Tij for accumulating Aik*Bkj
7718 00000083 892C39 240 mov dword ptr [ecx][edi],ebp
7719 00000086 896C3904 241 mov dword ptr [ecx][edi+4], ebp
7721 0000008A 51 243 push ecx ; save on stack: esi * num_col =
7722 244 ; the offset of the beginning
7723 245 ; of the ith row from the
7724 246 ; beginning of the A matrix
7727 0000008B 01E9 249 add ecx, ebp ; get to the kth column entry
7728 250 ; of the ith row of the A matrix
7730 252 ; load Aik into 80387
7731 0000008D DD0408 253 fld qword ptr [eax][ecx]
7734 00000090 8BCD 256 mov ecx, ebp
7735 00000092 6BC904 257 imul ecx, NUM_ROW ; ecx contains the offset
7736 258 ; of the beginning of the
7737 259 ; kth row from the
7738 260 ; beginning of the B matrix
7739 00000095 01F9 261 add ecx, edi ; get to the jth column entry
7740 262 ; of the kth row of the B
7742 00000097 DC0C0B 264 fmul qword ptr [ebx][ecx]; Aik * Bkj
7743 0000009A 59 265 pop ecx ; esi * num_col
7745 0000009B 51 267 push ecx ; also at top of program
7748 270 ; add to the result in the output matrix, Tij
7749 0000009C 01F9 271 add ecx, edi
7751 273 ; accumulating the sum of Aik * Bkj
7752 0000009E DC040A 274 fadd qword ptr [edx][ecx]
7753 000000A1 DD1C0A 275 fstp qword ptr [edx][ecx]
7754 276 ; increment k by 1, i.e., ebp by 8
7755 000000A4 83C508 277 add ebp, 8
7757 279 ; Has k reached the width of the matrix yet?
7758 000000A7 83FD20 280 cmp ebp, NUM_COL*8
7759 000000AA 7CDF 281 jl NXT_k
7761 283 ; Restore registers
7762 000000AC 59 284 pop ecx ; clear esi*num_col from stack
7763 000000AD 59 285 pop ecx ; restore ecx
7764 000000AE 5D 286 pop ebp ; restore ebp
7767 000000B0 289 matrix_elem endp
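; The indexing used by matrix_elem can be modeled in Python with flat lists of
; 16 values and byte offsets, as a sketch of the intended arithmetic rather
; than a translation of the listing. Assumptions: row_off plays the role of
; ESI ((i-1)*8), col_off the role of EDI ((j-1)*8), each element is 8 bytes,
; and the driver matrix_mul is a hypothetical stand-in for matrix_row and
; matrixmul_proc.

```python
NUM_COL = 4
ELEM_BYTES = 8   # each matrix element is a dq (8-byte real)

def matrix_elem(a, b, t, row_off, col_off):
    """Accumulate Tij = sum over k of Aik * Bkj using byte offsets."""
    row_base = row_off * NUM_COL                   # byte offset of row i
    dest = (row_base + col_off) // ELEM_BYTES
    t[dest] = 0.0                                  # clear Tij
    for k_off in range(0, NUM_COL * ELEM_BYTES, ELEM_BYTES):
        aik = a[(row_base + k_off) // ELEM_BYTES]
        bkj = b[(k_off * NUM_COL + col_off) // ELEM_BYTES]
        t[dest] += aik * bkj

def matrix_mul(a, b):
    """Call matrix_elem for every row and column offset (0, 8, 16, 24)."""
    t = [0.0] * 16
    for row_off in range(0, NUM_COL * ELEM_BYTES, ELEM_BYTES):
        for col_off in range(0, NUM_COL * ELEM_BYTES, ELEM_BYTES):
            matrix_elem(a, b, t, row_off, col_off)
    return t
```

; Under this layout, element a34 sits at byte offset (2*4 + 3)*8 = 88, which
; matches the [ebx + 88] store in trans_proc.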
7770 000000B0 292 matrix_row proc far
7772 000000B0 31FF 294 xor edi, edi
7773 295 ; scan across a row
7775 000000B2 297 NXT_COL:
7776 000000B2 9A7A000000.... R 298 call matrix_elem
7777 000000B9 83C708 299 add edi, 8
7778 000000BC 83FF20 300 cmp edi, NUM_COL*8
7779 000000BF 7CF1 301 jl NXT_COL
7782 000000C2 304 matrix_row endp
7785 000000C2 307 matrixmul_proc proc far
7787 309 ; This procedure does the matrix
7788 310 ; multiplication by calling matrix_row
7789 311 ; to calculate entries in each row
7791 313 ; The matrix multiplication is
7792 314 ; performed in the following manner,
7793 315 ; Tij = Aik x Bkj
7794 316 ; where i and j denote the row and column
7795 317 ; respectively and k is the index for
7796 318 ; scanning across the ith row of the
7797 319 ; first matrix and the jth column of the
7798 320 ; second matrix.
7799 000000C2 5A 321 pop edx ; offset Tmx in edx
7800 000000C3 5B 322 pop ebx ; offset Bmx in ebx
7801 000000C4 58 323 pop eax ; offset Amx in eax
7803 325 ; setup esi and edi
7804 326 ; edi points to the column
7805 327 ; esi points to the row
7807 000000C5 31F6 329 xor esi, esi ; clear esi
7809 000000C7 331 NXT_ROW:
7810 000000C7 9AB0000000---- R 332 call matrix_row
7811 000000CE 83C608 333 add esi, 8
7812 000000D1 83FE20 334 cmp esi, NUM_ROW*8
7813 000000D4 7CF1 335 jl NXT_ROW
7816 000000D7 338 matrixmul_proc endp
7819 -------- 341 trans_code ends
7821 343 ;***************************************
7825 347 ; Main program ;
7829 351 ;***************************************
7831 -------- 353 main_code segment er
7835 00000000 BC00000000 R 357 mov esp, stackstart trans_stack
7836 358 ; save all registers
7838 00000005 60 360 pushad
7840 362 ; ECX denotes the number of joints
7841 363 ; where no of matrices = NUM_JOINT + 1
7842 364 ; Find the first matrix( from the base
7843 365 ; of the system to the first joint)
7844 366 ; and call it Bmx
7845 00000006 31C9 367 xor ecx, ecx ; 1st matrix
7846 00000008 BB80000000 R 368 mov ebx, offset Bmx ;
7847 0000000D 9A00000000---- R 369 call trans_proc ; is Bmx
7848 00000014 41 370 inc ecx
7850 00000015 372 NXT_MATRIX:
7851 373 ; From the 2nd matrix and on, it
7852 374 ; will be stored in Amx.
7853 375 ; The result from the first matrix mult.
7854 376 ; is stored in Tmx but will be accessed
7855 377 ; as Bmx in the next multiplication.
7856 378 ; As a matter of fact, the roles of Bmx
7857 379 ; and Tmx alternate in successive
7858 380 ; multiplications. This is achieved by
7859 381 ; reversing the order of the Bmx and Tmx
7860 382 ; pointers being passed onto the program
7861 383 ; stack: Thus, this is invisible to the
7862 384 ; matrix multiplication procedure.
7863 385 ; REVERSE serves as the indicator;
7864 386 ; REVERSE = 0 means that the result
7865 387 ; is to be placed in Tmx.
7867 00000015 BB00000000 R 389 mov ebx, offset Amx ;find Amx
7868 0000001A 9A00000000---- R 390 call trans_proc
7869 00000021 41 391 inc ecx
7870 00000022 8035B801000001 R 392 xor REVERSE, 1h
7871 00000029 7511 393 jnz Bmx_as_Tmx
7873 395 ; no reversing. Bmx as the second input
7874 396 ; matrix while Tmx as the output matrix.
7875 0000002B 6800000000 R 397 push offset Amx
7876 00000030 6880000000 R 398 push offset Bmx
7877 00000035 6800010000 R 399 push offset Tmx
7878 0000003A EB0F 400 jmp CONTINUE
7880 402 ; reversing. Tmx as the second input
7881 403 ; matrix while Bmx as the output matrix.
7882 0000003C 404 Bmx_as_Tmx:
7883 0000003C 6800000000 R 405 push offset Amx
7884 00000041 6800010000 R 406 push offset Tmx ;reversing the
7885 00000046 6880000000 R 407 push offset Bmx ;pointers passed
7887 0000004B 409 CONTINUE:
7888 0000004B 9AC2000000---- R 410 call matrixmul_proc
7889 00000052 83F901 411 cmp ecx, NUM_JOINT
7890 00000055 7EBE 412 jle NXT_MATRIX
7892 414 ; if REVERSE = 1 then the final answer
7893 415 ; will be in Bmx otherwise, in Tmx.
7895 00000057 61 417 popad
7897 -------- 419 main_code ends
7899 421 end START, ds:trans_data, ss:trans_stack
7901 ASSEMBLY COMPLETE, NO WARNINGS, NO ERRORS.
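The alternation of Bmx and Tmx controlled by REVERSE can be sketched in Python. This is a model of the buffer swapping only; the operand order follows the formula T1n = A1 x A2 x ... x An from the module header, and the names matmul_into and chain are hypothetical.

```python
def matmul_into(out, first, second):
    """Store first * second in out; out must not alias either input."""
    for i in range(4):
        for j in range(4):
            out[i][j] = sum(first[i][k] * second[k][j] for k in range(4))

def chain(matrices):
    """Multiply a non-empty list of 4 x 4 matrices using two alternating buffers."""
    bmx = [row[:] for row in matrices[0]]      # first matrix goes into Bmx
    tmx = [[0.0] * 4 for _ in range(4)]
    reverse = 1
    for amx in matrices[1:]:
        reverse ^= 1
        if reverse == 0:
            matmul_into(tmx, bmx, amx)         # Bmx is input, Tmx is output
        else:
            matmul_into(bmx, tmx, amx)         # roles reversed
    # REVERSE = 1 means the final answer is in Bmx, otherwise in Tmx.
    return bmx if reverse == 1 else tmx
```

Because each product is written to the buffer that is not being read, no element of the running result is overwritten while it is still needed.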
7904 Appendix A Machine Instruction Encoding and Decoding
7906 ---------------------------------------------------------------------------
7909 Hex Binary 2nd Byte Bytes 3-7 ASM386 Instruction Format
7911 D8 1101 1000 MOD 000 R/M SIB, displ FADD single-real
7912 D8 1101 1000 MOD 001 R/M SIB, displ FMUL single-real
7913 D8 1101 1000 MOD 010 R/M SIB, displ FCOM single-real
7914 D8 1101 1000 MOD 011 R/M SIB, displ FCOMP single-real
7915 D8 1101 1000 MOD 100 R/M SIB, displ FSUB single-real
7916 D8 1101 1000 MOD 101 R/M SIB, displ FSUBR single-real
7917 D8 1101 1000 MOD 110 R/M SIB, displ FDIV single-real
7918 D8 1101 1000 MOD 111 R/M SIB, displ FDIVR single-real
7919 D8 1101 1000 1100 0 REG FADD ST,ST(i)
7920 D8 1101 1000 1100 1 REG FMUL ST,ST(i)
7921 D8 1101 1000 1101 0 REG FCOM ST(i)
7922 D8 1101 1000 1101 1 REG FCOMP ST(i)
7923 D8 1101 1000 1110 0 REG FSUB ST,ST(i)
7924 D8 1101 1000 1110 1 REG FSUBR ST,ST(i)
7925 D8 1101 1000 1111 0 REG FDIV ST,ST(i)
7926 D8 1101 1000 1111 1 REG FDIVR ST,ST(i)
7927 D9 1101 1001 MOD 000 R/M SIB, displ FLD single-real
7928 D9 1101 1001 MOD 001 R/M reserved
7929 D9 1101 1001 MOD 010 R/M SIB, displ FST single-real
7930 D9 1101 1001 MOD 011 R/M SIB, displ FSTP single-real
7931 D9 1101 1001 MOD 100 R/M SIB, displ FLDENV 14 or 28 bytes
7932 (The size of the operand transferred depends on the 80386 operand-size
7933 attribute in effect for the instruction.)
7939 D9 1101 1001 MOD 101 R/M SIB, displ FLDCW 2 bytes
7940 D9 1101 1001 MOD 110 R/M SIB, displ FSTENV 14 or 28 bytes
7941 (The size of the operand transferred depends on the 80386 operand-size
7942 attribute in effect for the instruction.)
7948 D9 1101 1001 MOD 111 R/M SIB, displ FSTCW 2 bytes
7949 D9 1101 1001 1100 0 REG FLD ST(i)
7950 D9 1101 1001 1100 1 REG FXCH ST(i)
7951 D9 1101 1001 1101 0000 FNOP
7952 D9 1101 1001 1101 0001 reserved
7953 D9 1101 1001 1101 001- reserved
7954 D9 1101 1001 1101 01-- reserved
7955 D9 1101 1001 1101 1 REG reserved
7956 D9 1101 1001 1110 0000 FCHS
7957 D9 1101 1001 1110 0001 FABS
7958 D9 1101 1001 1110 001- reserved
7959 D9 1101 1001 1110 0100 FTST
7960 D9 1101 1001 1110 0101 FXAM
7961 D9 1101 1001 1110 011- reserved
7962 D9 1101 1001 1110 1000 FLD1
7963 D9 1101 1001 1110 1001 FLDL2T
7964 D9 1101 1001 1110 1010 FLDL2E
7965 D9 1101 1001 1110 1011 FLDPI
7966 D9 1101 1001 1110 1100 FLDLG2
7967 D9 1101 1001 1110 1101 FLDLN2
7968 D9 1101 1001 1110 1110 FLDZ
7969 D9 1101 1001 1110 1111 reserved
7970 D9 1101 1001 1111 0000 F2XM1
7971 D9 1101 1001 1111 0001 FYL2X
7972 D9 1101 1001 1111 0010 FPTAN
7973 D9 1101 1001 1111 0011 FPATAN
7974 D9 1101 1001 1111 0100 FXTRACT
7975 D9 1101 1001 1111 0101 FPREM1
7976 D9 1101 1001 1111 0110 FDECSTP
7977 D9 1101 1001 1111 0111 FINCSTP
7978 D9 1101 1001 1111 1000 FPREM
7979 D9 1101 1001 1111 1001 FYL2XP1
7980 D9 1101 1001 1111 1010 FSQRT
7981 D9 1101 1001 1111 1011 FSINCOS
7982 D9 1101 1001 1111 1100 FRNDINT
7983 D9 1101 1001 1111 1101 FSCALE
7984 D9 1101 1001 1111 1110 FSIN
7985 D9 1101 1001 1111 1111 FCOS
7986 DA 1101 1010 MOD 000 R/M SIB, displ FIADD short-integer
7987 DA 1101 1010 MOD 001 R/M SIB, displ FIMUL short-integer
7988 DA 1101 1010 MOD 010 R/M SIB, displ FICOM short-integer
7989 DA 1101 1010 MOD 011 R/M SIB, displ FICOMP short-integer
7990 DA 1101 1010 MOD 100 R/M SIB, displ FISUB short-integer
7991 DA 1101 1010 MOD 101 R/M SIB, displ FISUBR short-integer
7992 DA 1101 1010 MOD 110 R/M SIB, displ FIDIV short-integer
7993 DA 1101 1010 MOD 111 R/M SIB, displ FIDIVR short-integer
7994 DA 1101 1010 110- ---- reserved
7995 DA 1101 1010 1110 0--- reserved
7996 DA 1101 1010 1110 1000 reserved
7997 DA 1101 1010 1110 1001 FUCOMPP
7998 DA 1101 1010 1110 101- reserved
7999 DA 1101 1010 1110 11-- reserved
8000 DA 1101 1010 1111 ---- reserved
8001 DB 1101 1011 MOD 000 R/M SIB, displ FILD short-integer
8002 DB 1101 1011 MOD 001 R/M SIB, displ reserved
8003 DB 1101 1011 MOD 010 R/M SIB, displ FIST short-integer
8004 DB 1101 1011 MOD 011 R/M SIB, displ FISTP short-integer
8005 DB 1101 1011 MOD 100 R/M SIB, displ reserved
8006 DB 1101 1011 MOD 101 R/M SIB, displ FLD extended-real
8007 DB 1101 1011 MOD 110 R/M SIB, displ reserved
8008 DB 1101 1011 MOD 111 R/M SIB, displ FSTP extended-real
8009 DB 1101 1011 110- ---- reserved
8010 DB 1101 1011 1110 0000
8011 (This encoding can be generated by the language translators;
8012 however, the 80387 treats it as FNOP. It corresponds to the following
8013 8087 or 80287 instruction: FENI.)
8019 DB 1101 1011 1110 0001
8020 (This encoding can be generated by the language translators;
8021 however, the 80387 treats it as FNOP. It corresponds to the following
8022 8087 or 80287 instruction: FDISI.)
8028 DB 1101 1011 1110 0010 FCLEX
8029 DB 1101 1011 1110 0011 FINIT
8030 DB 1101 1011 1110 0100
8031 (This encoding can be generated by the language translators;
8032 however, the 80387 treats it as FNOP. It corresponds to the following
8033 80287 instruction: FSETPM.)
8039 DB 1101 1011 1110 0101 reserved
8040 DB 1101 1011 1110 011- reserved
8041 DB 1101 1011 1110 1--- reserved
8042 DB 1101 1011 1111 ---- reserved
8043 DC 1101 1100 MOD 000 R/M SIB, displ FADD double-real
8044 DC 1101 1100 MOD 001 R/M SIB, displ FMUL double-real
8045 DC 1101 1100 MOD 010 R/M SIB, displ FCOM double-real
8046 DC 1101 1100 MOD 011 R/M SIB, displ FCOMP double-real
8047 DC 1101 1100 MOD 100 R/M SIB, displ FSUB double-real
8048 DC 1101 1100 MOD 101 R/M SIB, displ FSUBR double-real
8049 DC 1101 1100 MOD 110 R/M SIB, displ FDIV double-real
8050 DC 1101 1100 MOD 111 R/M SIB, displ FDIVR double-real
8051 DC 1101 1100 1100 0 REG FADD ST(i),ST
8052 DC 1101 1100 1100 1 REG FMUL ST(i),ST
8053 DC 1101 1100 1101 0 REG reserved
8054 DC 1101 1100 1101 1 REG reserved
8055 DC 1101 1100 1110 0 REG FSUBR ST(i),ST
8056 DC 1101 1100 1110 1 REG FSUB ST(i),ST
8057 DC 1101 1100 1111 0 REG FDIVR ST(i),ST
8058 DC 1101 1100 1111 1 REG FDIV ST(i),ST
8059 DD 1101 1101 MOD 000 R/M SIB, displ FLD double-real
8060 DD 1101 1101 MOD 001 R/M reserved
8061 DD 1101 1101 MOD 010 R/M SIB, displ FST double-real
8062 DD 1101 1101 MOD 011 R/M SIB, displ FSTP double-real
8063 DD 1101 1101 MOD 100 R/M SIB, displ FRSTOR 94 or 108 bytes
8064 (The size of the operand transferred depends on the 80386 operand-size
8065 attribute in effect for the instruction.)
8071 DD 1101 1101 MOD 101 R/M SIB, displ reserved
8072 DD 1101 1101 MOD 110 R/M SIB, displ FSAVE 94 or 108 bytes
8073 (The size of the operand transferred depends on the 80386 operand-size
8074 attribute in effect for the instruction.)
8080 DD 1101 1101 MOD 111 R/M SIB, displ FSTSW 2 bytes
8081 DD 1101 1101 1100 0 REG FFREE ST(i)
8082 DD 1101 1101 1100 1 REG reserved
8083 DD 1101 1101 1101 0 REG FST ST(i)
8084 DD 1101 1101 1101 1 REG FSTP ST(i)
8085 DD 1101 1101 1110 0 REG FUCOM ST(i)
8086 DD 1101 1101 1110 1 REG FUCOMP ST(i)
8087 DD 1101 1101 1111 ---- reserved
8088 DE 1101 1110 MOD 000 R/M SIB, displ FIADD word-integer
8089 DE 1101 1110 MOD 001 R/M SIB, displ FIMUL word-integer
8090 DE 1101 1110 MOD 010 R/M SIB, displ FICOM word-integer
8091 DE 1101 1110 MOD 011 R/M SIB, displ FICOMP word-integer
8092 DE 1101 1110 MOD 100 R/M SIB, displ FISUB word-integer
8093 DE 1101 1110 MOD 101 R/M SIB, displ FISUBR word-integer
8094 DE 1101 1110 MOD 110 R/M SIB, displ FIDIV word-integer
8095 DE 1101 1110 MOD 111 R/M SIB, displ FIDIVR word-integer
8096 DE 1101 1110 1100 0 REG FADDP ST(i),ST
8097 DE 1101 1110 1100 1 REG FMULP ST(i),ST
8098 DE 1101 1110 1101 0--- reserved
8099 DE 1101 1110 1101 1000 reserved
8100 DE 1101 1110 1101 1001 FCOMPP
8101 DE 1101 1110 1101 101- reserved
8102 DE 1101 1110 1101 11-- reserved
8103 DE 1101 1110 1110 0 REG FSUBRP ST(i),ST
8104 DE 1101 1110 1110 1 REG FSUBP ST(i),ST
8105 DE 1101 1110 1111 0 REG FDIVRP ST(i),ST
8106 DE 1101 1110 1111 1 REG FDIVP ST(i),ST
8107 DF 1101 1111 MOD 000 R/M SIB, displ FILD word-integer
8108 DF 1101 1111 MOD 001 R/M SIB, displ reserved
8109 DF 1101 1111 MOD 010 R/M SIB, displ FIST word-integer
8110 DF 1101 1111 MOD 011 R/M SIB, displ FISTP word-integer
8111 DF 1101 1111 MOD 100 R/M SIB, displ FBLD packed-decimal
8112 DF 1101 1111 MOD 101 R/M SIB, displ FILD long-integer
8113 DF 1101 1111 MOD 110 R/M SIB, displ FBSTP packed-decimal
8114 DF 1101 1111 MOD 111 R/M SIB, displ FISTP long-integer
8115 DF 1101 1111 1100 0 REG reserved
8116 DF 1101 1111 1100 1 REG reserved
8117 DF 1101 1111 1101 0 REG reserved
8118 DF 1101 1111 1101 1 REG reserved
8119 DF 1101 1111 1110 0000 FSTSW AX
8120 DF 1101 1111 1110 0001 reserved
8121 DF 1101 1111 1110 001- reserved
8122 DF 1101 1111 1110 01-- reserved
8123 DF 1101 1111 1110 1--- reserved
8124 DF 1101 1111 1111 ---- reserved
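As an illustration of how the table is read, the following Python sketch decodes the second byte of a D8 instruction using only the D8 rows above. The function name decode_d8 is hypothetical, and the other first-byte values (D9 through DF) would need their own tables.

```python
D8_OPS = ["FADD", "FMUL", "FCOM", "FCOMP", "FSUB", "FSUBR", "FDIV", "FDIVR"]

def decode_d8(second_byte: int) -> str:
    """Decode the byte that follows D8 into its ASM386 instruction format."""
    mod = (second_byte >> 6) & 0x3
    op = D8_OPS[(second_byte >> 3) & 0x7]
    reg = second_byte & 0x7
    if mod == 3:                                   # register form, 11 ooo rrr
        if op in ("FCOM", "FCOMP"):
            return f"{op} ST({reg})"
        return f"{op} ST,ST({reg})"
    return f"{op} single-real"                     # memory form, MOD ooo R/M

# Bytes taken from the listings in Chapter 7: D8C9, D8E2, and D835.
for byte in (0xC9, 0xE2, 0x35):
    print(f"D8 {byte:02X}: {decode_d8(byte)}")
```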
8127 Appendix B Exception Summary
8129 ---------------------------------------------------------------------------
8131 The following table lists the instruction mnemonics in alphabetical order.
8132 For each mnemonic, it summarizes the exceptions that the instruction may
8133 cause. When writing 80387 programs that may be used in an environment that
8134 employs numerics exception handlers, assembly-language programmers should be
8135 aware of the possible exceptions for each instruction in order to determine
8136 the need for exception synchronization. Chapter 4 explains the need for
8137 exception synchronization.
8140 Mnemonic Instruction Exceptions (one column per code below)
8141 IS = Invalid operand due to stack overflow/underflow
8142 I = Invalid operand due to other cause
8143 D = Denormal operand
8147 P = Inexact result (precision)
8154 F2XM1 2^(X) - 1 Y Y Y Y Y
8155 FABS Absolute value Y
8156 FADD(P) Add real Y Y Y Y Y Y
8158 FBSTP BCD store and pop Y Y Y
8160 FCLEX Clear exceptions
8161 FCOM(P)(P) Compare real Y Y Y
8162 FCOS Cosine Y Y Y Y Y
8163 FDECSTP Decrement stack pointer
8164 FDIV(R)(P) Divide real Y Y Y Y Y Y Y
8166 FIADD Integer add Y Y Y Y Y Y
8167 FICOM(P) Integer compare Y Y Y
8168 FIDIV Integer divide Y Y Y Y Y Y
8169 FIDIVR Integer divide reversed Y Y Y Y Y Y Y
8171 FIMUL Integer multiply Y Y Y Y Y Y
8172 FINCSTP Increment stack pointer
8173 FINIT Initialize processor
8174 FIST(P) Integer store Y Y Y
8175 FISUB(R) Integer subtract Y Y Y Y Y Y
8177 or stack Load real Y
8179 or double Load real Y Y Y
8181 FLDCW Load Control word Y Y Y Y Y Y Y
8182 FLDENV Load environment Y Y Y Y Y Y Y
8183 FLDL2E Load log{2}e Y
8184 FLDL2T Load log{2}10 Y
8185 FLDLG2 Load log{10}2 Y
8186 FLDLN2 Load log{e}2 Y
8189 FMUL(P) Multiply real Y Y Y Y Y Y
8191 FPATAN Partial arctangent Y Y Y Y Y
8192 FPREM Partial remainder Y Y Y Y
8193 FPREM1 IEEE partial remainder Y Y Y Y
8194 FPTAN Partial tangent Y Y Y Y Y
8195 FRNDINT Round to integer Y Y Y Y
8196 FRSTOR Restore state Y Y Y Y Y Y Y
8198 FSCALE Scale Y Y Y Y Y Y
8200 FSINCOS Sine and cosine Y Y Y Y Y
8201 FSQRT Square root Y Y Y Y
8203 or extended Store real Y
8205 or double Store real Y Y Y Y Y Y
8206 FSTCW Store control word
8207 FSTENV Store Environment
8208 FSTSW (AX) Store status word
8209 FSUB(R)(P) Subtract real Y Y Y Y Y Y
8211 FUCOM(P)(P) Unordered compare real Y Y Y
8214 FXCH Exchange registers Y
8215 FXTRACT Extract Y Y Y Y
8216 FYL2X Y * log{2}X Y Y Y Y Y Y Y
8217 FYL2XP1 Y * log{2}(X + 1) Y Y Y Y Y
8220 Appendix C Compatibility Between the 80387 and the 80287/8087
8222 ---------------------------------------------------------------------------
8224 This appendix summarizes the differences between the 80387 and its
8225 predecessors the 80287 and the 8087, and analyzes the impact of these
8226 differences on software that must be transported from the 80287 or 8087 to
8227 the 80387. Any migration from the 8087 directly to the 80387 must also take
8228 into account the additional differences between the 8087 and the 80387 as
8229 listed in Appendix D of this manual.
Columns: Issue; Difference Description (80387 Behavior, 8087/80287
Behavior); Impact on Software; Reason for the Difference
8235 C.1 INITIALIZATION SEQUENCE
Issue: RESET, FINIT, and ERROR# pin
  80387 behavior: After a hardware RESET, the ERROR# output is asserted to
    indicate that an 80387 is present. To accomplish this, the IE and ES
    bits of the status word are set, and the IM bit in the control word is
    reset. After FINIT, the status word and the control word have the same
    values as in [...]
  8087/80287 behavior: No difference between RESET and FINIT.
  Impact on software: 80387 initialization software must execute an FNINIT
    instruction to clear ERROR#. The FNINIT is not required for 80287/8087
    software, though Intel documentation recommends its use (refer to the
    Numerics Supplement to the iAPX 286 Programmer's Reference Manual).
  Reason for the difference: Permits the 80386 to differentiate between the
    80287 and the 80387.
8251 C.2 DATA TYPES AND EXCEPTION HANDLING
Issue: NaN
  80387 behavior: The 80387 distinguishes between signaling NaNs and quiet
    NaNs. The 80387 only generates quiet NaNs. An invalid-operation
    exception is raised only upon encountering a signaling NaN (except for
    FCOM, FIST, and FBSTP which also raise IE for [...]
  8087/80287 behavior: The 80287/8087 only generates one kind of NaN (the
    equivalent of a quiet NaN) but raises an invalid-operation exception
    upon encountering any kind of NaN.
  Impact on software: Uninitialized memory locations that contain QNaNs
    should be changed to SNaNs to cause the 80387 to fault when
    uninitialized memory locations are referenced.
  Reason for the difference: IEEE Standard 754 compatibility.
Issue: Pseudozero, Pseudo-NaN, Pseudoinfinity, and Unnormal Formats
  80387 behavior: The 80387 neither generates nor supports these formats;
    it raises an invalid-operation exception whenever it encounters them
    in an arithmetic [...]
  8087/80287 behavior: The 80287/8087 defines and supports special handling
    for these formats.
  Impact on software: None. The 80387 does not generate these formats, and
    therefore will not encounter them unless a programmer deliberately
    enters them.
  Reason for the difference: IEEE Standard 754 compatibility.
Issue: Tag Word Bits for Unsupported Data Formats
  80387 behavior: The encoding in the tag word for the unsupported data
    formats mentioned in Section C.2.2 is "special data" (type 10).
  8087/80287 behavior: The encoding for pseudozero and unnormal is "valid"
    (type 00); the others are "special data" (type 10).
  Impact on software: The exception handler may need to be changed if
    programmers use such data types.
  Reason for the difference: IEEE Standard 754 compatibility.
Issue: Invalid-Operation Exception
  80387 behavior: No invalid-operation exception is raised upon
    encountering a denormal in FSQRT, FDIV, or FPREM or upon conversion to
    BCD or to integer. The operation proceeds by first normalizing the
    value.
  8087/80287 behavior: Upon encountering a denormal in FSQRT, FDIV, or
    FPREM or upon conversion to BCD or to integer, the invalid-operation
    exception is raised.
  Impact on software: None. Software on the 80387 will continue to execute
    in cases where the 80287/8087 would trap.
  Reason for the difference: Upgrade, to eliminate exception.
Issue: Denormal Exception
  80387 behavior: The denormal exception is raised in transcendental
    instructions and FXTRACT.
  8087/80287 behavior: The denormal exception is not raised in
    transcendental instructions and FXTRACT.
  Impact on software: The exception handler needs to be changed only if it
    gives special treatment to different opcodes.
  Reason for the difference: Performance enhancement for normal case.
Issue: Overflow Exception
  Overflow exception masked.
    80387 behavior: If the rounding mode is set to chop (toward zero), the
      result is the most positive or most negative number.
    8087/80287 behavior: The 80287/8087 does not signal the overflow
      exception when the masked response is not infinity; i.e., it signals
      overflow only when the rounding control is not set to round to zero.
      If rounding is set to chop (toward zero), the result is positive or
      negative infinity.
    Impact on software: Under the most common rounding modes, no impact. If
      rounding is toward zero (chop), a program on the 80387 produces under
      overflow conditions a result that is different in the least
      significant bit of the significand, compared to the result on the
      80287.
  Overflow exception not masked.
    80387 behavior: The precision exception is flagged. When the result is
      stored in the stack, the significand is rounded according to the
      precision control (PC) bit of the control word or according to the
      opcode.
    8087/80287 behavior: The precision exception is not flagged and the
      significand is not rounded.
    Impact on software: If the result is stored on the stack, a program on
      the 80387 produces a different result under overflow conditions than
      on the 80287/8087. The difference is apparent only to the exception
      handler.
  Reason for the difference: IEEE Standard 754 compatibility.
Issue: Underflow Exception. Two related events contribute to underflow:
    [...] tiny result. A tiny number, because it is so small, may cause
    some other exception later (such as overflow upon division).
    2. Loss of [...]
  Conditions for underflow.
    80387 behavior: When the underflow exception is masked, the underflow
      exception is signaled when both the result is tiny and
      denormalization results in a loss of accuracy.
    8087/80287 behavior: When the underflow exception is masked and
      rounding is toward zero, the underflow exception flag is raised on
      tininess, regardless of loss of accuracy.
    Impact on software (underflow exception masked): No impact. The
      underflow exception occurs less often when rounding is toward zero.
  Response to underflow.
    80387 behavior: When the underflow exception is unmasked and the
      instruction is supposed to store the result on the stack, the
      significand is rounded to the appropriate precision (according to
      the precision control (PC) bit of the control word, for those
      instructions controlled by PC, otherwise to extended precision).
    8087/80287 behavior: When the underflow exception is not masked and the
      destination is the stack, the significand is not rounded but rather
      is left as is.
    Impact on software (underflow exception not masked): A program on the
      80387 produces a different result during underflow conditions than on
      the 80287/8087 if the result is stored on the stack. The difference
      is only in the least significant bit of the significand and is
      apparent only to the exception handler.
  Reason for the difference: IEEE Standard 754 compatibility.
Issue: Exception Precedence
  80387 behavior: There is no difference in the precedence of the denormal
    exception, whether it be masked or [...]
  8087/80287 behavior: When the denormal exception is not masked, it takes
    precedence over all other exceptions.
  Impact on software: None, but some unneeded normalization of denormal
    operands is prevented on the 80387.
  Reason for the difference: Operational improvement.
8373 C.3 TAG, STATUS, AND CONTROL WORDS
Issue: Bits C3-C0 of Status Word
  80387 behavior: After FINIT, incomplete FPREM, and hardware reset, the
    80387 sets these bits to zero.
  8087/80287 behavior: After FINIT, incomplete FPREM, and hardware reset,
    the 80287/8087 leaves these bits intact (they contain the prior [...]
  Impact on software: None.
  Reason for the difference: Upgrade, to provide consistent state after
    reset.
Issue: Bit C2 of Status Word
  80387 behavior: Bit 10 (C2) serves as an incomplete bit for FPTAN.
  8087/80287 behavior: This bit is undefined for FPTAN.
  Impact on software: None. Programs don't check C2 after FPTAN.
  Reason for the difference: Upgrade to allow fast checking of operand
    range.
Issue: Infinity Control
  80387 behavior: Only affine closure is supported. Bit 12 remains
    programmable but has no effect on 80387 operation.
  8087/80287 behavior: Both affine and projective closures are supported.
    After RESET, the default value in the control word is [...]
  Impact on software: Software that requires projective infinity arithmetic
    may give different results.
  Reason for the difference: IEEE Standard 754 compatibility.
Issue: Status Word Bit 6 for Stack Fault
  80387 behavior: When an invalid-operation exception occurs due to stack
    overflow or underflow, not only is bit 0 (IE) of the status word set,
    but also bit 6 is set to indicate a stack fault and bit 9 (C1)
    specifies overflow or underflow. Bit 6 is called SF and serves to
    distinguish invalid exceptions caused by stack overflow/underflow from
    those caused by numeric [...]
  8087/80287 behavior: When an invalid-operation exception occurs due to
    stack overflow or underflow, only bit 0 (IE) of the status word is set.
    Bit 6 is RESERVED.
  Impact on software: None. Existing exception handlers need not change,
    but may be upgraded to take advantage of the additional information.
    Newly written handlers will be more effective.
  Reason for the difference: Upgrade and performance improvement.
Issue: Tag Word
  80387 behavior: When loading the tag word with an FLDENV or FRSTOR
    instruction, the only interpretations of tag values used by the 80387
    are empty (value 11) and nonempty (values 00, 01, and 10). Subsequent
    operations on a nonempty register always examine the value in the
    register, not the value in its tag. The FSTENV and FSAVE instructions
    examine the nonempty registers and put the correct values in the tags
    before storing the tag word.
  8087/80287 behavior: The corresponding tag is checked before each
    register access to determine the class of operand in the register; the
    tag is updated after every change to a register so that the tag always
    reflects the most recent status of the register. Programmers can load
    a tag with a value that disagrees with the contents of a register (for
    example, the register contains valid contents, but the tag says
    special; the 80287/8087, in this case, honors the tag and does not
    examine the [...]
  Impact on software: Software may not operate correctly if it uses FLDENV
    or FRSTOR to change tags to values (other than empty) that are
    different from actual register contents.
  Reason for the difference: Performance improvement.
Issue: FBSTP, FDIV, FIST(P), FPREM, FSQRT
  80387 behavior: Operation on denormal operand is supported. An underflow
    exception can occur.
  8087/80287 behavior: Operation on denormal operand raises
    invalid-operation exception. Underflow is not possible.
  Impact on software: The exception handler for underflow may require
    change only if it gives different treatment to different opcodes.
    Possibly fewer invalid-operation exceptions will occur.
  Reason for the difference: IEEE Standard 754 compatibility.
Issue: FSCALE
  80387 behavior: The range of the scaling operand is not restricted. If
    0 < |ST(1)| < 1, the scaling factor is zero; therefore, ST(0) remains
    unchanged. If the rounded result is not exact or if [...] accuracy
    (masked underflow), the precision exception [...]
  8087/80287 behavior: The range of the scaling operand is restricted. If
    0 < |ST(1)| < 1, the result is undefined and no exception is signaled.
  Impact on software: Different result when 0 < |ST(1)| < 1.
  Reason for the difference: Upgrade.
Issue: FPREM1
  80387 behavior: Performs partial remainder according to IEEE Standard
    754.
  8087/80287 behavior: Does not exist.
  Impact on software: None.
  Reason for the difference: IEEE Standard 754 compatibility and upgrade.
Issue: FPREM
  80387 behavior: Bits C0, C3, C1 of the status word correctly reflect the
    three low-order bits of the quotient.
  8087/80287 behavior: The quotient bits are incorrect when performing a
    reduction of 64^(N) + M when N ≥ 1 and M=1 or M=2.
  Impact on software: None. Software that works around the bug should not
    be affected.
  Reason for the difference: Upgrade.
Issue: FUCOM, FUCOMP, FUCOMPP
  80387 behavior: Perform unordered compare according to [...]
  8087/80287 behavior: Do not exist.
  Impact on software: None.
  Reason for the difference: IEEE Standard 754 compatibility.
Issue: FPTAN
  Operand range.
    80387 behavior: Range of operand is much less restricted
      (|ST(0)| < 2^(63)); reduces operand internally using an internal π/4
      constant that is more [...]
    8087/80287 behavior: Range of operand is restricted (|ST(0)| < π/4);
      operand must be reduced to range using FPREM.
    Impact on software: None.
    Reason for the difference: Upgrade.
  Stack overflow.
    80387 behavior: After a stack overflow when the invalid-operation
      exception is masked, both ST and ST(1) contain quiet NaNs.
    8087/80287 behavior: After a stack overflow when the invalid-operation
      exception is masked, the original operand remains unchanged, but is
      pushed [...]
    Reason for the difference: IEEE Standard 754 compatibility.
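Issue: FSIN, FCOS, FSINCOS
  80387 behavior: Perform three common trigonometric functions.
  8087/80287 behavior: Do not exist.
  Impact on software: None.
  Reason for the difference: Upgrade.
Issue: FPATAN
  80387 behavior: Range of operands is unrestricted.
  8087/80287 behavior: |ST(0)| must be smaller than |ST(1)|.
  Impact on software: None.
  Reason for the difference: Upgrade.
Issue: F2XM1
  80387 behavior: Wider range of operand (-1 ≤ ST(0) ≤ +1).
  8087/80287 behavior: The supported operand range is 0 ≤ ST(0) ≤ 0.5.
  Impact on software: None.
  Reason for the difference: Upgrade.
Issue: FLD extended-real
  80387 behavior: Does not report denormal exception because the
    instruction is not arithmetic.
  8087/80287 behavior: Reports denormal exception.
  Impact on software: None.
  Reason for the difference: Upgrade.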
Issue: FXTRACT
  80387 behavior: If the operand is zero, the zero-divide exception is
    reported and ST(1) is -∞. If the operand is +∞, no exception is
    reported.
  8087/80287 behavior: If the operand is zero, ST(1) is zero and no
    exception is reported. If the operand is +∞, the invalid-operation
    exception [...]
  Impact on software: None. Software usually bypasses zero and ∞.
  Reason for the difference: IEEE 754 recommendation to fully support the
    logb function.
Issue: FLD constant
  80387 behavior: Rounding control is in effect.
  8087/80287 behavior: Rounding control is not in effect.
  Impact on software: Results are the same as for the 8087/80287 when
    rounding control is set to round to zero, round to -∞, and (in the case
    of FLDL2T) round to nearest. Results are different by one in the least
    significant bit of the significand in round to +∞ and round to [...]
    FLDL2T). FLD1 and FLDZ are always the same.
  Reason for the difference: IEEE 754 recommendation.
Issue: FLD single/double precision
  80387 behavior: Loading a denormal causes the number to be converted to
    extended precision (because it is put [...]
  8087/80287 behavior: Loading a denormal causes the number to be converted
    to an unnormal.
  Impact on software: If the next instruction is FXTRACT or FXAM, the 80387
    will give a different result than the 80287/8087.
  Reason for the difference: IEEE Standard 754 compatibility.
Issue: FLD single/double precision
  80387 behavior: When loading a signaling NaN, raises invalid exception.
  8087/80287 behavior: Does not raise an exception when loading a signaling
    NaN.
  Impact on software: The exception handler needs to be updated to handle
    this condition.
  Reason for the difference: IEEE Standard 754 compatibility.
Issue: FSETPM
  80387 behavior: Treated as FNOP (no operation).
  8087/80287 behavior: Informs the 80287 that the system is in protected
    mode.
  Impact on software: None.
  Reason for the difference: The 80386 handles all addressing and
    exception-pointer information, whether in protected mode [...]
Issue: FXAM
  80387 behavior: When encountering an empty register, the 80387 [...]
    combinations of C3-C0 equal to [...]
  8087/80287 behavior: May generate these combinations, among others.
  Impact on software: None.
  Reason for the difference: Upgrade, to provide repeatable results.
Issue: All Transcendental Instructions
  80387 behavior: May generate different results in round-up bit of status
    word.
  8087/80287 behavior: Round-up bit of status word is undefined for these
    instructions.
  Impact on software: None.
  Reason for the difference: Upgrade, to signal rounding status.
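The stack-fault entry in Section C.3 can be illustrated with a short sketch.
The Python below is not part of the manual. It assumes the status word has
already been read into an integer (for example, from a value stored by FSTSW)
and uses the bit positions named in that entry: bit 0 (IE), bit 6 (SF), and
bit 9 (C1). The function name is hypothetical, and the mapping of C1 = 1 to
overflow and C1 = 0 to underflow is an assumption; the entry says only that
C1 distinguishes the two cases.

```python
IE = 1 << 0   # invalid-operation flag
SF = 1 << 6   # stack-fault flag (reserved on the 8087/80287)
C1 = 1 << 9   # condition bit C1


def classify_invalid(status_word: int) -> str:
    """Classify an invalid-operation condition from a 16-bit status word."""
    if not status_word & IE:
        return "no invalid operation"
    if not status_word & SF:
        # IE without SF: the cause was a numeric operand, not the stack.
        return "invalid arithmetic operand"
    # Assumption: C1 = 1 means stack overflow, C1 = 0 means stack underflow.
    return "stack overflow" if status_word & C1 else "stack underflow"
```

A handler written for the 8087/80287 sees only IE; a handler upgraded along
these lines can separate stack faults from operand faults.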
8539 Appendix D Compatibility Between the 80387 and the 8087
8541 ---------------------------------------------------------------------------
8543 The 80386/80387 operating in real-address mode will execute 8087 programs
8544 without major modification. However, because of differences in the handling
8545 of numeric exceptions between the 80387 NPX and the 8087 NPX,
8546 exception-handling routines may need to be changed.
8548 This appendix summarizes the additional differences between the 80387 NPX
8549 and the 8087 NPX (other than those already included in Appendix C), and
8550 provides details showing how 8087 programs can be ported to the 80387.
8552 1. The 80387 signals exceptions through a dedicated ERROR# line to
8553 the 80386; no interrupt controller is needed for this purpose. The
8554 8087 requires an interrupt controller (8259A) to interrupt the CPU
8555 when an unmasked exception occurs. Therefore, any
8556 interrupt-controller-oriented instructions in numeric exception
8557 handlers for the 8087 should be deleted.
8559 2. The 8087 instructions FENI/FNENI and FDISI/FNDISI perform no useful
8560 function in the 80387. If the 80387 encounters one of these opcodes in
8561 its instruction stream, the instruction will effectively be
8562 ignored; none of the 80387 internal states will be updated. While 8087
8563 code containing these instructions may be executed on the 80387, it
8564 is unlikely that the exception-handling routines containing these
8565 instructions will be completely portable to the 80387.
8567 3. In real mode and protected mode (not including virtual 8086 mode),
8568 interrupt vector 16 must point to the numeric exception handling
8569 routine. In virtual 8086 mode, the V86 monitor can be programmed to
8570 accommodate a different location of the interrupt vector for numeric
8573 4. The ESC instruction address saved in the 80386/80387 or 80386/80287
8574 includes any leading prefixes before the ESC opcode. The corresponding
8575 address saved in the 8086/8087 does not include leading prefixes.
8577 5. In protected mode (not including virtual 8086 mode), the format of
8578 the 80387's saved instruction and address pointers is different from
8579 that of the 8087. The instruction opcode is not saved in protected
8580 mode; exception handlers will have to retrieve the opcode from memory
8583 6. Interrupt 7 will occur in the 80386 when executing ESC instructions
8584 with either TS (task switched) or EM (emulation) of the 80386 MSW set
8585 (TS=1 or EM=1). If TS is set, then a WAIT instruction will also cause
8586 interrupt 7. An exception handler should be included in 80387 code to
8587 handle these situations.
8589 7. Interrupt 9 will occur if the second or subsequent words of a
8590 floating-point operand fall outside a segment's size. Interrupt 13
8591 will occur if the starting address of a numeric operand falls outside
8592 a segment's size. An exception handler should be included to report
8593 these programming errors.
8595 8. Except for the processor control instructions, all of the 80387
8596 numeric instructions are automatically synchronized by the 80386
8597 CPU: the 80386 automatically waits until all operands have been
8598 transferred between the 80386 and the 80387 before executing the
8599 next ESC instruction. No explicit WAIT instructions are required to
8600 assure this synchronization. For the 8087 used with 8086 and 8088
8601 processors, explicit WAITs are required before each numeric
8602 instruction to ensure synchronization. Although 8087 programs having
8603 explicit WAIT instructions will execute perfectly on the 80387
8604 without reassembly, these WAIT instructions are unnecessary.
8606 9. Since the 80387 does not require WAIT instructions before each
8607 numeric instruction, the ASM386 assembler does not automatically
8608 generate these WAIT instructions. The ASM86 assembler, however,
8609 automatically precedes every ESC instruction with a WAIT
8610 instruction. Although numeric routines generated using the ASM86
8611 assembler will generally execute correctly on the 80386/20,
8612 reassembly using ASM386 may result in a more compact code image and
8615 The processor control instructions for the 80387 may be coded using
8616 either a WAIT or No-WAIT form of mnemonic. The WAIT forms of these
8617 instructions cause ASM386 to precede the ESC instruction with a CPU
8618 WAIT instruction, in the same manner as ASM86.
8620 10. The address of a memory operand stored by FSAVE or FSTENV is
8621 undefined if the previous ESC instruction did not refer to memory.
8623 11. Because the 80387 automatically normalizes denormal numbers when
8624 possible, an 8087 program that uses the denormal exception solely to
8625 normalize denormal operands can run on an 80387 by masking the
8626 denormal exception. The 8087 denormal exception handler would not be
8627 used by the 80387 in this case. A numerics program runs faster when
8628 the 80387 performs normalization of denormal operands. A program can
8629 detect at run-time whether it is running on an 80387 or 8087/80287 and
8630 disable the denormal exception when an 80387 is used.
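The conditions in item 6 above can be written as a small decision function.
This Python sketch is illustrative only and is not part of the manual. The
function name and the string labels "ESC" and "WAIT" are hypothetical. It
models only what item 6 states; other MSW bits that the 80386 may consult
(such as MP for WAIT) are not mentioned in item 6 and are deliberately left
out.

```python
def raises_interrupt_7(instruction: str, ts: bool, em: bool) -> bool:
    """Return True if, per item 6, the instruction causes interrupt 7.

    instruction: "ESC" for any numeric (ESC) instruction, "WAIT" for WAIT.
    ts, em: the TS (task switched) and EM (emulation) bits of the MSW.
    """
    if instruction == "ESC":
        return ts or em          # either bit set traps an ESC instruction
    if instruction == "WAIT":
        return ts                # only TS traps a WAIT instruction
    return False                 # other instructions are unaffected
```

An interrupt 7 handler can use the same distinction: with EM set it would
emulate the instruction, and with TS set it would typically save and restore
the numeric context before retrying.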
8633 Appendix E 80387 80-Bit CHMOS III Numeric Processor Extension
8635 For Advance Information on the Intel 80387 please consult Appendix E of the
8636 printed version of this book or the 80387 Data Sheet, order number 231920.
8639 Appendix F PC/AT-Compatible 80387 Connection
8641 ---------------------------------------------------------------------------
8643 The PC/AT uses a nonstandard scheme to report 80287 exceptions to the
8644 80286. When replicating the PC/AT coprocessor interface in 80386-based
8645 systems, the PC/AT interface cannot be used in exactly the same way;
8646 however, this appendix outlines a similar interface that works on
8647 80386/80387 systems and maintains compatibility with the nonstandard PC/AT
8650 Note that the interface outlined here does not represent a new interface
8651 standard; it needs to be incorporated in AT-compatible designs only because
8652 the 80286 and 80287 in the PC/AT are not connected according to the
8653 standards defined by Intel. The standard 80386/80387 connection recommended
8654 by Intel in the 80387 Data Sheet functions properly; the 80386
8655 implementation has not been and will not be altered.
8658 F.1 The PC/AT Interface
8660 In the PC/AT, the ERROR# input to the 80286 is tied inactive (high)
8661 permanently. The ERROR# output of the 80287 is tied to an interrupt port
8662 (IRQ13). This interrupt replaces exception signaling via the 80286's ERROR#
8663 input. To guarantee (in the case of an 80287 exception) that INTR 13 will be
8664 serviced prior to the execution of any further 80287 instructions, an
8665 edge-triggered flip-flop latches BUSY# using ERROR# as a clock. The output
8666 of this latch is ORed with the BUSY# output of the 80287 and drives the
8667 BUSY# input of the 80286. This PC/AT scheme effectively delays deactivation
8668 of BUSY# at the 80286 whenever an 80287 ERROR# is signaled.
8670 Since the 80286 BUSY# input remains active after an exception, the 80286
8671 interrupt 13 handler is guaranteed to execute before any other 80287
8672 instructions may begin. The interrupt 13 handler clears the BUSY# latch (via
8673 a write to a special I/O port), thus allowing execution of 80287
8674 instructions to proceed. The interrupt 13 handler then branches to the NMI
8675 handler, where the user-defined numerics exception handler resides in
8676 PC-compatible systems.
8678 The use of an interrupt guarantees that an exception from a coprocessor
8679 instruction will be detected. Latching BUSY# guarantees that any coprocessor
8680 instruction (except FINIT, FSETPM, and FCLEX) following the instruction that
8681 raised the exception will not be executed before the NMI handler is
8684 This PC/AT scheme approximates the exception reporting scheme between the
8685 8087 and 8088 in the original PC.
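The latch behavior described in this section can be modeled in a few lines.
The Python below is a behavioral sketch under stated assumptions, not a
description of the actual PC/AT circuit: signals are represented as booleans
where True means "asserted" (the active-low electrical levels are abstracted
away), and the class and method names are hypothetical.

```python
class BusyLatch:
    """Behavioral model of the PC/AT BUSY# latch clocked by ERROR#."""

    def __init__(self) -> None:
        self.latched = False        # output of the edge-triggered flip-flop
        self._prev_error = False    # previous ERROR level, for edge detection

    def update(self, npx_busy: bool, npx_error: bool) -> bool:
        """Feed the current NPX outputs; return the BUSY seen by the CPU."""
        # Edge-triggered: capture BUSY only when ERROR becomes asserted.
        if npx_error and not self._prev_error:
            self.latched = npx_busy
        self._prev_error = npx_error
        # The latch output is ORed with the NPX's own BUSY output.
        return self.latched or npx_busy

    def clear(self) -> None:
        """Model the interrupt handler's write to the special I/O port."""
        self.latched = False
```

In this model the CPU continues to see BUSY asserted after the NPX itself
goes idle, until the handler calls clear(), which matches the delayed
deactivation of BUSY# described above.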
8688 F.2 How to Achieve the Same Effect in an 80386 System
8690 The 80386 can use a PC/AT-compatible interface to communicate with an 80387
8691 provided that, when an NPX exception occurs, BUSY# active time is extended
8692 and PEREQ is reactivated only after 80387 BUSY# has gone inactive. The 80387
8693 is left active (tying STEN high) at all times. Also, the 80386 and 80387
8694 must be reset by the same RESET signal.
8696 The reactivation of PEREQ for the 80386 is needed for store instructions
8697 (for example, FST mem) because the 80387 drops PEREQ once it signals an
8698 exception. While the 80386 has not yet recognized the occurrence of the
8699 exception, it still expects the data transfers to complete via PEREQ
8700 reactivation. It is permissible for the 80386 to receive undefined data
8701 during such I/O read cycles. Disabling the 80387 is not necessary, because
8702 the dummy data-transfer cycles directed to the 80387 when PEREQ is
8703 externally reactivated for the 80386 will not disturb the operation of the
8704 80387. The interrupt 13 handler should remove the extension of BUSY# and
8705 reactivation of PEREQ via a write to PC/AT-compatible hardware at I/O port
8709 Glossary of 80387 and Floating-Point Terminology
8711 ---------------------------------------------------------------------------
8713 This glossary defines many terms that have precise technical meanings as
8714 specified in the IEEE 754 Standard or as specified in this manual. Where
8715 these terms are used, they have been italicized to emphasize the precision
8719 (1) a term used in logarithms and exponentials. In both contexts, it is a
8720 number that is being raised to a power. The two equations (y = log base b
8721 of x) and (b^(y) = x) are equivalent.
8724 (2) a number that defines the representation being used for a string of
8725 digits. Base 2 is the binary representation; base 10 is the decimal
8726 representation; base 16 is the hexadecimal representation. In each case,
8727 the base is the factor of increased significance for each succeeding
8728 digit (working up from the bottom).
8731 a constant that is added to the true exponent of a real number to obtain
8732 the exponent field of that number's floating-point representation in the
8733 80387. To obtain the true exponent, you must subtract the bias from the
8734 given exponent. For example, the single real format has a bias of 127
8735 whenever the given exponent is nonzero. If the 8-bit exponent field
8736 contains 10000011, which is 131, the true exponent is 131-127, or +4.
8739 the exponent as it appears in a floating-point representation of a number.
8740 The biased exponent is interpreted as an unsigned, positive number. In the
8741 above example, 131 is the biased exponent.
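The bias arithmetic in the example above can be checked mechanically. The
following Python sketch is illustrative and not part of the manual; the
helper name single_exponents is hypothetical, and it assumes that the
standard struct module packs the "f" format as an IEEE 754 single. It
applies only to numbers whose exponent field is nonzero.

```python
import struct

SINGLE_BIAS = 127  # bias of the single real format, as stated above


def single_exponents(x: float):
    """Return (biased exponent, true exponent) of x in single real format."""
    # Reinterpret the 4 bytes of the single as an unsigned 32-bit integer.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    biased = (bits >> 23) & 0xFF      # the 8-bit exponent field
    return biased, biased - SINGLE_BIAS
```

For 16.0, which is 2 raised to the power 4, the exponent field holds
10000011 (131) and the true exponent is 131-127 = +4, matching the example.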
8743 Binary Coded Decimal
8744 a method of storing numbers that retains a base 10 representation. Each
8745 decimal digit occupies 4 full bits (one hexadecimal digit). The
8746 hexadecimal values A through F (1010 through 1111) are not used. The
8747 80387 supports a packed decimal format that consists of 9 bytes of binary
8748 coded decimal (18 decimal digits) and one sign byte.
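A short sketch of the packed decimal layout follows. It is not part of the
manual. The function name is hypothetical, and two layout details are
assumptions not stated in the definition above: that the least significant
digit pair is stored in the first byte, and that a negative value is marked
by setting the top bit of the sign byte.

```python
def to_packed_decimal(value: int) -> bytes:
    """Encode an integer as a 10-byte packed decimal (9 BCD bytes + sign)."""
    magnitude = abs(value)
    if magnitude >= 10 ** 18:
        raise ValueError("packed decimal holds at most 18 decimal digits")
    out = bytearray(10)
    for i in range(9):
        magnitude, pair = divmod(magnitude, 100)
        high, low = divmod(pair, 10)
        out[i] = (high << 4) | low        # two decimal digits per byte
    # Assumption: sign is carried in the top bit of the tenth byte.
    out[9] = 0x80 if value < 0 else 0x00
    return bytes(out)
```

Each nibble produced this way is in the range 0 through 9, so the
hexadecimal values A through F never appear in the nine digit bytes.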
8751 an entity just like a decimal point, except that it exists in binary
8752 numbers. Each binary digit to the right of the binary point is multiplied
8753 by an increasing negative power of two.
8756 the four "condition code" bits of the 80387 status word. These bits are
8757 set to certain values by the compare, test, examine, and remainder
8758 functions of the 80387.
8761 a term used for some non-Intel computers, meaning the exponent field of a
8762 floating-point number.
8765 to set one or more low-order bits of a real number to zero, yielding the
8766 nearest representable number in the direction of zero.
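Chopping can be demonstrated on a host double. The Python below is an
illustrative sketch, not 80387 code: the function name is hypothetical, and
it assumes Python floats are IEEE 754 doubles with 52 stored significand
bits. Because the format is sign-magnitude, clearing low-order bits always
moves the value toward zero.

```python
import struct


def chop(x: float, dropped_bits: int) -> float:
    """Zero the low-order `dropped_bits` significand bits of a double."""
    if not 0 <= dropped_bits <= 52:
        raise ValueError("a double has 52 stored significand bits")
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    bits &= ~((1 << dropped_bits) - 1)    # clear the low-order bits
    return struct.unpack(">d", struct.pack(">Q", bits))[0]
```

Keeping only one fraction bit of 1.75 (binary 1.11) gives 1.5, and the same
operation on -1.75 gives -1.5; in both cases the magnitude decreases.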
8769 the four bits of the 80387 status word that indicate the results of the
8770 compare, test, examine, and remainder functions of the 80387.
8773 a 16-bit 80387 register that the user can set, to determine the modes of
8774 computation the 80387 will use and the exception interrupts that will be
8778 a special form of floating-point number. On the 80387, a denormal is
8779 defined as a number that has a biased exponent of zero. By providing a
8780 significand with leading zeros, the range of possible negative
8781 exponents can be extended by the number of bits in the significand.
8782 Each leading zero is a bit of lost accuracy, so the extended exponent
8783 range is obtained by reducing significance.
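The value of a denormal can be computed directly from its bit pattern. The
sketch below uses the single real format as an example and is not part of
the manual. The function name is hypothetical; the 23-bit fraction width and
the fixed true exponent of -126 for denormals are properties of the single
format assumed here rather than stated in the definition above.

```python
def decode_single_denormal(bits: int) -> float:
    """Value of a 32-bit single real pattern whose biased exponent is zero."""
    if (bits >> 23) & 0xFF != 0:
        raise ValueError("biased exponent is nonzero: not a denormal")
    sign = -1.0 if (bits >> 31) & 1 else 1.0
    fraction = bits & 0x7FFFFF
    # The integer bit is 0, so the value is fraction * 2**-23 * 2**-126.
    return sign * fraction * 2.0 ** -149
```

Each leading zero in the fraction halves the smallest representable step
relative to the value, which is the loss of significance the definition
describes: the pattern 00000001 hex has a single significant bit.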
8786 the Standard's term for the 80387's extended format, with more exponent
8787 and significand bits than the double format and an explicit integer bit
8791 a floating-point format supported by the 80387 that consists of a sign, an
8792 11-bit biased exponent, an implicit integer bit, and a 52-bit
8793 significand, for a total of 64 explicit bits.
8796 the 14 or 28 (depending on addressing mode) bytes of 80387 registers
8797 affected by the FSTENV and FLDENV instructions. It encompasses the entire
8798 state of the 80387, except for the 8 registers of the 80387 stack.
8799 Included are the control word, status word, tag word, and the instruction,
8800 opcode, and operand information provided by interrupts.
8803 any of the six conditions (invalid operand, denormal, numeric overflow,
8804 numeric underflow, zero-divide, and precision) detected by the 80387 that
8805 may be signaled by status flags or by traps.
8808 The data maintained by the 80386 to help exception handlers identify
8809 the cause of an exception. This data consists of a pointer to the most
8810 recently executed ESC instruction and a pointer to the memory operand of
8811 this instruction, if it had a memory operand. An exception handler can use
8812 the FSTENV and FSAVE instructions to access these pointers.
8815 (1) any number that indicates the power to which another number is raised.
8818 (2) the field of a floating-point number that indicates the magnitude of
8819 the number. This would fall under the above more general definition (1),
8820 except that a bias sometimes needs to be subtracted to obtain the correct
8824 the 80387's implementation of the Standard's double extended format.
8825 Extended format is the main floating-point format used by the 80387.
8826 It consists of a sign, a 15-bit biased exponent, and a significand with an
8827 explicit integer bit and 63 fractional-part bits.
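The field layout just described can be decoded as follows. This Python
sketch is illustrative and not part of the manual. The function name is
hypothetical, and it makes assumptions the definition does not state: an
exponent bias of 16383, little-endian byte order in memory, and a true
exponent of -16382 when the exponent field is zero. Infinities and NaNs are
rejected rather than decoded.

```python
from fractions import Fraction

EXTENDED_BIAS = 16383  # assumed bias for the 15-bit exponent field


def decode_extended(raw: bytes) -> Fraction:
    """Decode a finite 10-byte little-endian extended-format value exactly."""
    if len(raw) != 10:
        raise ValueError("extended format occupies 10 bytes")
    # Low 8 bytes: explicit integer bit (bit 63) plus 63 fraction bits.
    significand = int.from_bytes(raw[:8], "little")
    # High 2 bytes: sign bit and 15-bit biased exponent.
    sign_exp = int.from_bytes(raw[8:], "little")
    sign = -1 if sign_exp & 0x8000 else 1
    exponent = sign_exp & 0x7FFF
    if exponent == 0x7FFF:
        raise ValueError("infinity or NaN")
    true_exponent = max(exponent, 1) - EXTENDED_BIAS
    return sign * Fraction(significand, 1 << 63) * Fraction(2) ** true_exponent
```

Using exact fractions avoids any rounding in the host language, since the
64-bit significand does not fit in a double.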
8830 of or pertaining to a number that is expressed as a base, a sign, a
8831 significand, and a signed exponent. The value of the number is the signed
8832 product of its significand and the base raised to the power of the
8833 exponent. Floating-point representations are more versatile than integer
8834 representations in two ways. First, they include fractions. Second, their
8835 exponent parts allow a much wider range of magnitude than possible with
8836 fixed-length integer representations.
8839 a method of handling the underflow error condition that minimizes the loss
8840 of accuracy in the result. If there is a denormal number that represents
8841 the correct result, that denormal is returned. Thus, digits are lost only
8842 to the extent of denormalization. Most computers return zero when
8843 underflow occurs, losing all significant digits.
8845 Implicit Integer Bit
8846 a part of the significand in the single real and double real formats
8847 that is not explicitly given. In these formats, the entire given
8848 significand is considered to be to the right of the binary point. A single
8849 implicit integer bit to the left of the binary point is always one, except
8850 in one case. When the exponent is the minimum (biased exponent is zero),
8851 the implicit integer bit is zero.
8854 a special value that is returned by functions when the inputs are such
8855 that no other sensible answer is possible. For each floating-point format
8856 there exists one quiet NaN that is designated as the indefinite value. For
8857 binary integer formats, the negative number furthest from zero is often
8858 considered the indefinite value. For the 80387 packed decimal format, the
8859 indefinite value contains all 1's in the sign byte and the uppermost
8863 The Standard's term for the 80387's precision exception.
8866 a value that has greater magnitude than any integer or any real number. It
8867 is often useful to consider infinity as another number, subject to special
8868 rules of arithmetic. All three Intel floating-point formats provide
8869 representations for +∞ and -∞.
8872 a number (positive, negative, or zero) that is finite and has no
8873 fractional part. Integer can also mean the computer representation for
8874 such a number: a sequence of data bytes, interpreted in a standard way. It
8875 is perfectly reasonable for integers to be represented in a floating-point
8876 format; this is what the 80387 does whenever an integer is pushed onto the
8880 a part of the significand in floating-point formats. In these formats, the
8881 integer bit is the only part of the significand considered to be to the
8882 left of the binary point. The integer bit is always one, except in one
8883 case: when the exponent is the minimum (biased exponent is zero), the
8884 integer bit is zero. In the extended format the integer bit is explicit;
8885 in the single format and double format the integer bit is implicit; i.e.,
8886 it is not actually stored in memory.
8889 the exception condition for the 80387 that covers all cases not covered by
8890 other exceptions. Included are 80387 stack overflow and underflow, NaN
8891 inputs, illegal infinite inputs, out-of-range inputs, and inputs in
8892 unsupported formats.
8895 an integer format supported by the 80387 that consists of a 64-bit two's
8896 complement quantity.
8899 an older term for the 80387's 64-bit double format.
8902 a term used with some non-Intel computers for the significand of a
8903 floating-point number.
8906 a term that applies to each of the six 80387 exceptions I,D,Z,O,U,P. An
8907 exception is masked if a corresponding bit in the 80387 control word is
8908 set to one. If an exception is masked, the 80387 will not generate an
8909 interrupt when the exception condition occurs; it will instead provide its
8910 own exception recovery.
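
The mask bits can be read out of a control word value as sketched below. The bit positions (I in bit 0 through P in bit 5) and the value 037FH used in the usage note are taken from the 80387 control word layout rather than from this glossary entry; the names EXCEPTION_MASK_BITS and masked_exceptions are hypothetical.

```python
# Assumed control word layout: one mask bit per exception, bits 0 through 5.
EXCEPTION_MASK_BITS = {"I": 0, "D": 1, "Z": 2, "O": 3, "U": 4, "P": 5}

def masked_exceptions(control_word):
    """Return the letters of the exceptions whose mask bit is set to one."""
    return [
        name
        for name, bit in EXCEPTION_MASK_BITS.items()
        if (control_word >> bit) & 1
    ]
```

With the control word 037FH, which the 80387 loads on initialization, masked_exceptions returns all six letters; clearing bit 0 (037EH) leaves only the invalid-operation exception unmasked.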
8913 One of the control word fields "rounding control" and "precision control"
8914 which programs can set, sense, save, and restore to control the execution
8915 of subsequent arithmetic operations.
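
The two mode fields occupy bits 8 through 11 of the 80387 control word: precision control in bits 8-9 and rounding control in bits 10-11. The sketch below decodes them from a control word value. The field encodings come from the control word description, not from this entry, and the names ROUNDING, PRECISION_BITS, and decode_modes are hypothetical.

```python
# Rounding control encodings (bits 10-11 of the control word).
ROUNDING = {0: "nearest", 1: "down", 2: "up", 3: "chop"}

# Precision control encodings (bits 8-9), given as significand bits.
# Encoding 1 is reserved, so it has no entry here.
PRECISION_BITS = {0: 24, 2: 53, 3: 64}

def decode_modes(control_word):
    """Return (rounding mode, significand precision in bits or None)."""
    rc = (control_word >> 10) & 0b11
    pc = (control_word >> 8) & 0b11
    return ROUNDING[rc], PRECISION_BITS.get(pc)
```

For the control word 037FH this gives round-to-nearest with a 64-bit significand.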
8918 an abbreviation for "Not a Number"; a floating-point quantity that does
8919 not represent any numeric or infinite quantity. NaNs should be returned
8920 by functions that encounter serious errors. If created during a sequence
8921 of calculations, they are transmitted to the final answer and can contain
8922 information about where the error occurred.
8925 the representation of a number in a floating-point format in which the
8926 significand has an integer bit one (either explicit or implicit).
8929 convert a denormal representation of a number to a normal representation.
8932 Numeric Processor Extension. This is the 80387, 80287, or 8087.
8935 an exception condition in which the correct answer is finite, but has
8936 magnitude too great to be represented in the destination format. This kind
8937 of overflow (also called numeric overflow) is not to be confused with
8941 an integer format supported by the 80387. A packed decimal number is a
8942 10-byte quantity, with nine bytes of 18 binary coded decimal digits and
8943 one byte for the sign.
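
The layout can be sketched in Python as follows. The byte order is an assumption drawn from the 80387 memory format rather than from this entry: the two least significant digits are placed in the byte at the lowest address, and the sign occupies bit 7 of the tenth byte. The function names pack_bcd and unpack_bcd are hypothetical.

```python
def pack_bcd(value):
    """Encode an integer as a 10-byte packed decimal image (lowest address first)."""
    if abs(value) >= 10 ** 18:
        raise ValueError("packed decimal holds at most 18 digits")
    digits = f"{abs(value):018d}"  # digits[0] is the 10^17 digit, digits[17] the units
    image = bytearray(10)
    for i in range(9):
        low = int(digits[17 - 2 * i])    # digit weighted 10^(2i)
        high = int(digits[16 - 2 * i])   # digit weighted 10^(2i+1)
        image[i] = (high << 4) | low
    image[9] = 0x80 if value < 0 else 0x00  # sign byte; only bit 7 is used here
    return bytes(image)

def unpack_bcd(image):
    """Decode a 10-byte packed decimal image back to an integer."""
    magnitude = 0
    for byte in reversed(image[:9]):
        magnitude = magnitude * 100 + (byte >> 4) * 10 + (byte & 0x0F)
    return -magnitude if image[9] & 0x80 else magnitude
```

Under these assumptions, pack_bcd(-1234) yields the bytes 34 12 00 00 00 00 00 00 00 80 (hexadecimal, lowest address first).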
8946 to remove from a stack the last item that was placed on the stack.
8949 The effective number of bits in the significand of the floating-point
8950 representation of a number.
8953 an option, programmed through the 80387 control word, that allows all
8954 80387 arithmetic to be performed with reduced precision. Because no
8955 speed advantage results from this option, its only use is for strict
8956 compatibility with the Standard and with other computer systems.
8959 an 80387 exception condition that results when a calculation does not
8960 return an exact answer. This exception is usually masked and ignored; it
8961 is used only in extremely critical applications, when the user must know
8962 if the results are exact. The precision exception is called inexact
8966 one of a set of special values of the extended real format. The set
8967 consists of numbers with a zero significand and an exponent that is
8968 neither all zeros nor all ones. Pseudozeros are not created by the 80387
8969 but are handled correctly when encountered as operands.
8972 a NaN in which the most significant bit of the fractional part of the
8973 significand is one. By convention, these NaNs can undergo certain
8974 operations without causing an exception.
8977 any finite value (negative, positive, or zero) that can be represented by
8978 a (possibly infinite) decimal expansion. Reals can be represented as the
8979 points of a line marked off like a ruler. The term real can also refer
8980 to a floating-point number that represents a real value.
8983 an integer format supported by the 80387 that consists of a 32-bit two's
8984 complement quantity. Short integer is not the shortest 80387 integer
8985 format; the 16-bit word integer is.
8988 an older term for the 80387's 32-bit single format.
8991 a NaN that causes an invalid-operation exception whenever it enters into
8992 a calculation or comparison, even a nonordered comparison.
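
The distinction between quiet and signaling NaNs rests on a single bit, the most significant bit of the fraction. The sketch below classifies a 32-bit single-format pattern on that basis; the function name classify_nan is hypothetical, and the same test applies to the other formats with different field widths.

```python
def classify_nan(bits):
    """Classify a 32-bit single-format pattern as a quiet NaN, signaling NaN, or neither."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    # A NaN has an all-ones exponent and a nonzero fraction;
    # an all-ones exponent with a zero fraction is an infinity.
    if exponent != 0xFF or fraction == 0:
        return "not a NaN"
    # The most significant fraction bit separates quiet from signaling NaNs.
    return "quiet" if fraction & 0x400000 else "signaling"
```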
8995 the part of a floating-point number that consists of the most significant
8996 nonzero bits of the number, if the number were written out in an unlimited
8997 binary format. The significand is composed of an integer bit and a
8998 fraction. The integer bit is implicit in the single format and double
8999 format. The significand is considered to have a binary point after the
9000 integer bit; the binary point is then moved according to the value of the
9004 a floating-point format, required by the Standard, that provides greater
9005 precision than single; it also provides an explicit integer bit in the
9006 significand. The 80387's extended format meets the single extended
9007 requirement as well as the double extended requirement.
9010 a floating-point format supported by the 80387, which consists of a sign,
9011 an 8-bit biased exponent, an implicit integer bit, and a 23-bit
9012 fraction, for a total of 32 explicit bits.
9015 a special case of the invalid-operation exception which is indicated by a
9016 one in the SF bit of the status word. This condition usually results from
9017 stack underflow or overflow.
9020 "IEEE Standard for Binary Floating-Point Arithmetic," ANSI/IEEE
9024 A 16-bit 80387 register that can be manually set, but which is usually
9025 controlled by side effects of 80387 instructions. It contains condition
9026 codes, the 80387 stack pointer, busy and interrupt bits, and exception
9030 a 16-bit 80387 register that is automatically maintained by the 80387. For
9031 each space in the 80387 stack, it tells if the space is occupied by a
9032 number; if so, it gives information about what kind of number.
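
A tag word value can be decoded two bits at a time, as sketched below. The encodings (00 valid, 01 zero, 10 special, 11 empty) are taken from the 80387 tag word description rather than from this entry, and the names TAG_MEANING and decode_tag_word are hypothetical. Tags are indexed by physical register number, not by position relative to the top of stack.

```python
# Assumed two-bit tag encodings.
# "special" covers NaNs, infinities, denormals, and unsupported formats.
TAG_MEANING = {0: "valid", 1: "zero", 2: "special", 3: "empty"}

def decode_tag_word(tag_word):
    """Return the tag of each physical register, register 0 first."""
    return [TAG_MEANING[(tag_word >> (2 * i)) & 0b11] for i in range(8)]
```

A tag word of FFFFH, for example, decodes to eight empty registers.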
9035 an older term for the 80387's 80-bit extended format.
9038 of or pertaining to a floating-point number that is so close to zero that
9039 its exponent is smaller than the smallest exponent that can be represented in
9040 the destination format.
9043 The three-bit field of the status word that indicates which 80387 register
9044 is the current top of stack.
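
The top field and the other status word fields can be extracted from a status word value as sketched below. The bit positions follow the 80387 status word layout (exception flags in bits 0-5, SF in bit 6, ES in bit 7, C0-C2 in bits 8-10, TOP in bits 11-13, C3 in bit 14, B in bit 15); the function name decode_status_word and the dictionary keys are hypothetical.

```python
def decode_status_word(status_word):
    """Split a 16-bit status word value into its fields."""
    return {
        # Exception flags, one per exception, in bits 0 through 5.
        "exceptions": [name for i, name in enumerate("IDZOUP")
                       if (status_word >> i) & 1],
        "SF": (status_word >> 6) & 1,   # stack fault
        "ES": (status_word >> 7) & 1,   # error summary
        "C0": (status_word >> 8) & 1,   # condition codes
        "C1": (status_word >> 9) & 1,
        "C2": (status_word >> 10) & 1,
        "C3": (status_word >> 14) & 1,
        "TOP": (status_word >> 11) & 0b111,  # register that is the top of stack
        "B": (status_word >> 15) & 1,   # busy
    }
```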
9047 one of a class of functions for which polynomial formulas are always
9048 approximate, never exact for more than isolated values. The 80387 supports
9049 trigonometric, exponential, and logarithmic functions; all are
9053 a method of representing integers. If the uppermost bit is zero, the
9054 number is considered positive, with the value given by the rest of the
9055 bits. If the uppermost bit is one, the number is negative, with the value
9056 obtained by subtracting (2^(bit count)) from all the given bits. For
9057 example, the 8-bit number 11111100 is -4, obtained by subtracting 2^(8)
from 252.
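
The rule can be written as a short Python sketch. The function name from_twos_complement is hypothetical; the argument bits is the pattern read as an unsigned number.

```python
def from_twos_complement(bits, bit_count):
    """Interpret an unsigned bit pattern as a two's complement integer."""
    if (bits >> (bit_count - 1)) & 1:
        # Uppermost bit is one: subtract 2^(bit count) from the unsigned value.
        return bits - (1 << bit_count)
    # Uppermost bit is zero: the value is given directly by the remaining bits.
    return bits
```

For the 8-bit pattern 11111100, the unsigned value is 252, and from_twos_complement(0b11111100, 8) returns 252 - 256 = -4.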
9061 the true value that tells how far and in which direction to move the
9062 binary point of the significand of a floating-point number. For example,
9063 if a single-format exponent is 131, we subtract the bias 127 to obtain the
9064 unbiased exponent +4. Thus, the real number being represented is the
9065 significand with the binary point shifted 4 bits to the right.
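
The subtraction in the example above can be checked with a small Python sketch. The names SINGLE_BIAS and unbiased_exponent are hypothetical. The result is meaningful only for normal numbers; zeros, denormals, infinities, and NaNs use the reserved exponent values.

```python
import struct

SINGLE_BIAS = 127  # bias of the single format exponent

def unbiased_exponent(x):
    """Return the unbiased exponent of x after rounding it to single format."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    biased = (bits >> 23) & 0xFF
    return biased - SINGLE_BIAS
```

The value 24.0 has the significand 1.1 (binary) and the biased exponent 131, so unbiased_exponent(24.0) returns 4: shifting the binary point of 1.1 four places to the right gives 11000 (binary), which is 24.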
9068 an exception condition in which the correct answer is nonzero, but has a
9069 magnitude too small to be represented as a normal number in the
9070 destination floating-point format. The Standard specifies that an attempt
9071 be made to represent the number as a denormal. This denormalization may
9072 result in a loss of significant bits from the significand. This kind of
9073 underflow (also called numeric underflow) is not to be confused with stack
9077 a term that applies to each of the six 80387 exceptions: I,D,Z,O,U,P. An
9078 exception is unmasked if a corresponding bit in the 80387 control word is
9079 set to zero. If an exception is unmasked, the 80387 will generate an
9080 interrupt when the exception condition occurs. You can provide an
9081 interrupt routine that customizes your exception recovery.
9084 an extended real representation in which the explicit integer bit of the
9085 significand is zero and the exponent is nonzero. Unnormal values are
9086 not supported by the 80387; they cause the invalid-operation exception
9087 when encountered as operands.
9090 Any number representation that is not recognized by the 80387. This
9091 includes several formats that are recognized by the 8087 and 80287;
9092 namely: pseudo-NaN, pseudoinfinity, and unnormal.
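
The three unsupported classes can be told apart from the exponent and the explicit integer bit of an extended-format value, as in the rough sketch below. It applies the unnormal definition above literally, and it assumes that pseudo-NaNs and pseudoinfinities are the encodings with an all-ones exponent and an integer bit of zero, which this glossary does not spell out. The function name and its two-argument form (the 16-bit sign/exponent word and the 64-bit significand) are hypothetical.

```python
def unsupported_class(sign_exponent, significand):
    """Return the unsupported class of an extended-format encoding, or None."""
    exponent = sign_exponent & 0x7FFF          # 15-bit biased exponent
    integer_bit = significand >> 63            # explicit integer bit
    fraction = significand & ((1 << 63) - 1)   # remaining 63 significand bits
    # Integer bit one, or exponent zero (zeros and denormals): not one of
    # the three unsupported classes.
    if integer_bit == 1 or exponent == 0:
        return None
    # Exponent all ones with integer bit zero (assumed definition).
    if exponent == 0x7FFF:
        return "pseudoinfinity" if fraction == 0 else "pseudo-NaN"
    # Integer bit zero with any other nonzero exponent.
    return "unnormal"
```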
9095 an integer format supported by both the 80386 and the 80387 that consists
9096 of a 16-bit two's complement quantity.
9099 an exception condition in which the inputs are finite, but the correct
9100 answer, even with an unlimited exponent, has infinite magnitude.